TRANSACTIONS 


OF THE 


AMERICAN MATHEMATICAL SOCIETY 


EDITED BY 


ROBERT D. CARMICHAEL 
FRANCIS R. SHARPE 


JACOB D. TAMARKIN 


WITH THE COOPERATION OF 


ERIC T. BELL EDWARD W. CHITTENDEN WILLIAM C. GRAUSTEIN 
OLIVE C. HAZLETT EINAR HILLE AUBREY J. KEMPNER 
JOHN R. KLINE ERNEST P. LANE CHARLES N. MOORE 
MARSTON MORSE GEORGE Y. RAINICH JOSEPH F. RITT 
CAROLINE E. SEELY CHARLES H. SISAM MARSHALL H. STONE 


VOLUME 35 
1933 


PUBLISHED BY THE SOCIETY 
MENASHA, WIS., AND NEW YORK 
1933 




COMPOSED, PRINTED AND BOUND BY 
The Collegiate Press 
GEORGE BANTA PUBLISHING COMPANY 
MENASHA, WISCONSIN 




TABLE OF CONTENTS 


VOLUME 35, 1933 


                                                                        PAGE
ADAMS, C. R., and CLARKSON, J. A., of Providence, R. I. On definitions of
    bounded variation for functions of two variables . . . . . . . . . . 824
AGNEW, R. P., of Providence, R. I. On Riesz and Cesàro methods of
    summability . . . . . . . . . . . . . . . . . . . . . . . . . . . . 532
ALBERT, A. A., of Chicago, Ill. Non-cyclic algebras of degree and exponent
    four . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 112
    Cyclic fields of degree eight . . . . . . . . . . . . . . . . . . . 949
BECKENBACH, E. F., and RADÓ, T., of Columbus, Ohio. Subharmonic
    functions and minimal surfaces . . . . . . . . . . . . . . . . . . . 648
    Subharmonic functions and surfaces of negative curvature . . . . . 662
BELL, E. T., of Pasadena, Calif. The Latin square, or cyclic, functions . 734
    Polynomial diophantine systems . . . . . . . . . . . . . . . . . . . 903
BLISS, G. A., and HESTENES, M. R., of Chicago, Ill. Sufficient conditions
    for a problem of Mayer in the calculus of variations . . . . . . . . 305
BRAHANA, H. R., of Urbana, Ill. Groups {S, T} whose commutator
    subgroups are abelian . . . . . . . . . . . . . . . . . . . . . . . 386
BULLOCK, R. C., of Jackson, Tenn. Non-conjugate osculating quadrics of a
    curve on a surface . . . . . . . . . . . . . . . . . . . . . . . . . 518
CAIRNS, S. S., of Bethlehem, Pa. An axiomatic basis for plane geometry . 234
CARLITZ, L., of Durham, N. C. On abelian fields . . . . . . . . . . . . 122
    On the representation of a polynomial in a Galois field as the sum of
    an even number of squares . . . . . . . . . . . . . . . . . . . . . 397
CARMICHAEL, R. D., of Urbana, Ill. Systems of linear difference equations
    and expansions in series of exponential functions . . . . . . . . . 1
CLARKSON, J. A., and ADAMS, C. R., of Providence, R. I. On definitions of
    bounded variation for functions of two variables . . . . . . . . . . 824
CURRIER, A. E., of Cambridge, Mass. Proof of the fundamental theorems
    on second-order cross partial derivatives . . . . . . . . . . . . . 245
DIAMOND, A. H., of Berkeley, Calif. The complete existential theory of the
    Whitehead-Huntington set of postulates for the algebra of logic . . 940
DOOB, J. L., of New York, N. Y. The boundary values of analytic
    functions. II . . . . . . . . . . . . . . . . . . . . . . . . . . . 418
GERGEN, J. J., of Cambridge, Mass. Convergence criteria for double Fourier
    series . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 29
GRAVES, L. M., of Chicago, Ill. A transformation of the problem of
    Lagrange in the calculus of variations . . . . . . . . . . . . . . . 675
GRIFFIN, M., of Durham, N. C. Invariants of Pfaffian systems . . . . . . 929
GROVE, V. G., of East Lansing, Mich. Contributions to the theory of
    transformations of nets in a space S_n . . . . . . . . . . . . . . . 683
HESTENES, M. R., of Chicago, Ill. Sufficient conditions for the general
    problem of Mayer with variable end points . . . . . . . . . . . . . 479


                                                                        PAGE
HESTENES, M. R., and BLISS, G. A., of Chicago, Ill. Sufficient conditions
    for a problem of Mayer in the calculus of variations . . . . . . . . 305
HOLLCROFT, T. R., of Aurora, N. Y. The general web of algebraic surfaces of
    order n and the involution defined by it . . . . . . . . . . . . . . 855
HUNTINGTON, E. V., of Cambridge, Mass. New sets of independent
    postulates for the algebra of logic, with special reference to
    Whitehead and Russell's Principia Mathematica . . . . . . . . . . . 274
    Boolean algebra. A correction . . . . . . . . . . . . . . . . . . . 557
    A second correction . . . . . . . . . . . . . . . . . . . . . . . . 971
JEFFERY, R. L., of Wolfville, Nova Scotia. Sets of k-extent in n-dimensional
    space . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 629
LATIMER, C. G., of Lexington, Ky. On the class number of a cyclic field . 411
LEFSCHETZ, S., and WHITEHEAD, J. H. C., of Princeton, N. J. On analytical
    complexes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 510
LEV, J., of Ithaca, N. Y. Effects of linear transformations on the divergence
    of bounded sequences and functions . . . . . . . . . . . . . . . . . 888
LEWIS, D. C., of Cambridge, Mass. Infinite systems of ordinary differential
    equations with applications to certain second-order partial
    differential equations . . . . . . . . . . . . . . . . . . . . . . . 792
McCOY, N. H., of Northampton, Mass. On the resultant of a system of
    forms homogeneous in each of several sets of variables . . . . . . . 215
McSHANE, E. J., of Chicago, Ill. Parametrizations of saddle surfaces, with
    application to the problem of Plateau . . . . . . . . . . . . . . . 716
MANNING, W. A., of Palo Alto, Calif. The degree and class of multiply
    transitive groups, III . . . . . . . . . . . . . . . . . . . . . . . 585
MILLER, G. A., of Urbana, Ill. Groups in which every operator has at most
    a prime number of conjugates . . . . . . . . . . . . . . . . . . . . 897
MONTGOMERY, D., of Iowa City, Iowa. Sections of point sets . . . . . . . 915
MYERS, S. B., of Cambridge, Mass. Sufficient conditions in the problem of
    the calculus of variations in n-space in parametric form and under
    general end conditions . . . . . . . . . . . . . . . . . . . . . . . 746
ORE, O., of New Haven, Conn. On a special class of polynomials . . . . . 559
PALEY, R. E. A. C., of Cambridge, Mass. A special integral function . . 709
PALEY, R. E. A. C., and WIENER, N., of Cambridge, Mass. Notes on the
    theory and application of Fourier transforms. I-II . . . . . . . . . 348
    Notes on the theory and application of Fourier transforms. III,
    IV, V, VI, VII . . . . . . . . . . . . . . . . . . . . . . . . . . . 761
PALL, G., of Montreal, Canada. The structure of the number of
    representations function in a binary quadratic form . . . . . . . . 491
RADÓ, T., of Columbus, Ohio. An iterative process in the problem of
    Plateau . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 869
RADÓ, T., and BECKENBACH, E. F., of Columbus, Ohio. Subharmonic
    functions and minimal surfaces . . . . . . . . . . . . . . . . . . . 648


                                                                        PAGE
    Subharmonic functions and surfaces of negative curvature . . . . . 662
ROTH, W. E., of Milwaukee, Wis. On the equation P(A, X) = 0 in matrices . 689
SAKS, S., of Warsaw, Poland. On some functionals . . . . . . . . . . . . 549
    Addition to the note on some functionals . . . . . . . . . . . . . . 965
SCHERBERG, M. G., of Minneapolis, Minn. The degree of convergence of a
    series of Bessel functions . . . . . . . . . . . . . . . . . . . . . 172
SCHOENBERG, I. J., of Chicago, Ill. On finite-rowed systems of linear
    inequalities in infinitely many variables. II . . . . . . . . . . . 452
SHEFFER, I. M., of State College, Pa. On the properties of polynomials
    satisfying a linear differential equation: Part I . . . . . . . . . 184
SHERMAN, J., of Philadelphia, Pa. On the numerators of the convergents of
    the Stieltjes continued fractions . . . . . . . . . . . . . . . . . 64
SINGER, J., of Princeton, N. J. Three-dimensional manifolds and their
    Heegaard diagrams . . . . . . . . . . . . . . . . . . . . . . . . . 88
SINKOV, A., of Washington, D. C. Families of groups generated by two
    operators of the same order . . . . . . . . . . . . . . . . . . . . 372
SNYDER, V., of Ithaca, N. Y. On a series of involutorial Cremona
    transformations of space defined by a pencil of ruled surfaces . . . 341
SULLIVAN, M. M., of Cambridge, Mass. On the derivatives of Newtonian
    and logarithmic potentials near the acting masses . . . . . . . . . 137
THOMAS, J. M., of Durham, N. C. Pfaffian systems of species one . . . . 356
WARD, M., of Pasadena, Calif. The cancellation law in the theory of
    congruences to a double modulus . . . . . . . . . . . . . . . . . . 254
    The arithmetical theory of linear recurring series . . . . . . . . . 600
WHITEHEAD, J. H. C., and LEFSCHETZ, S., of Princeton, N. J. On analytical
    complexes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 510
WHITNEY, H., of Princeton, N. J. A characterization of the closed 2-cell . 261
WIENER, N., and PALEY, R. E. A. C., of Cambridge, Mass. Notes on the
    theory and application of Fourier transforms. I-II . . . . . . . . . 348
    Notes on the theory and application of Fourier transforms. III,
    IV, V, VI, VII . . . . . . . . . . . . . . . . . . . . . . . . . . . 761
WIENER, N., of Cambridge, Mass., and YOUNG, R. C., of Cambridge,
    England. The total variation of g(x+h) - g(x) . . . . . . . . . . . 327
YOUNG, R. C., of Cambridge, England, and WIENER, N., of Cambridge,
    Mass. The total variation of g(x+h) - g(x) . . . . . . . . . . . . . 327
ZIPPIN, L., of Princeton, N. J. Correction to a paper on the Moore-Kline
    problem . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 972


SYSTEMS OF LINEAR DIFFERENCE EQUATIONS AND 
EXPANSIONS IN SERIES OF EXPONENTIAL 
FUNCTIONS* 


BY 
R. D. CARMICHAEL 


Introduction. The principal purpose of the first part of this paper is to
prove (§1.9) that the system (1.1) of linear non-homogeneous generalized
difference equations has solutions $g_k(x)$, $k = 1, 2, \dots, n$, which are integral
functions provided that the independent terms $\phi_\nu(x)$ are themselves integral
functions and provided that the system has a certain non-singular character
defined in §1.3. In case the $\phi_\nu(x)$ are further restricted to be of exponential
type (§1.5) then solutions of exponential type exist (§1.6) and indeed solutions
of exponential type at most equal to $q$ (called principal solutions) in
case no $\phi_\nu(x)$ is of higher type than $q$ and at least one of them is of precisely
this type. A useful symbolic notation (§1.2) is effective in carrying out the
argument.

In the second part of the paper we apply the results of the first part to the 
rather remarkable problem of the simultaneous expansion of m integral func- 
tions in composite power series, a problem which we have not seen treated 
elsewhere. 

The third part of the paper is devoted to the theory of a class of remark- 
able expansions in series of exponential functions, generalizing the theory of 
Fourier series. Whereas the basic region of convergence of Fourier series is a 
segment of a straight line, these new series, apart from certain particular 
cases, have certain polygons in the complex plane as their basic regions of 
convergence. The vertices of these polygons play the rôle of the end points of
the segments in the case of Fourier series, while the remaining points of the
polygon play the rôle of interior points of the segments. Several extensions of
the theory are briefly indicated (§3.4) and an application is made (§3.5) to 
the expansion of Bernoulli polynomials of higher order in series of exponential 
functions. 


I. ON A SYSTEM OF LINEAR DIFFERENCE EQUATIONS WITH 
CONSTANT COEFFICIENTS 


1.1. Formulation of the problem. We consider the problem of solving the 
system 
* Presented to the Society, August 31, 1932; received by the editors June 27, 1932.


R. D. CARMICHAEL [January 


(1.1)   \[ \sum_{j=1}^{n} c_{\nu j}\, g_j(x + a_{\nu j}) = \phi_\nu(x) \qquad (\nu = 1, 2, \dots, n) \]

of functional equations (generalized linear difference equations with constant
coefficients), where the functions $\phi_\nu(x)$, $\nu = 1, 2, \dots, n$, are $n$ given integral
functions and the $n$ functions $g_j(x)$, $j = 1, 2, \dots, n$, are to be determined
subject to the requirement that they shall be integral functions. In this
system the coefficients $c_{\nu j}$ and the additive terms $a_{\nu j}$ in the arguments are
given constants; in §1.3 we shall subject these constants to a certain negative
condition in order to avoid exceptional cases in the theory of the system.

We shall sometimes subject the $\phi_\nu(x)$ to additional restrictions and in
such cases we shall put like further restrictions upon the solutions $g_j(x)$,
thus obtaining what may be called principal solutions of the given system.

The theory of system (1.1) contains that of the single equation 


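(1.2)   \[ \gamma_1 F(x + a_1) + \gamma_2 F(x + a_2) + \cdots + \gamma_n F(x + a_n) = G(x) \]

where $a_1, a_2, \dots, a_n$ are different constants, $\gamma_1, \gamma_2, \dots, \gamma_n$ are constants
different from 0, $G(x)$ is an integral function, and $F(x)$ is to be determined
as an integral function. To see this it is sufficient to write $g_k(x) = F(x + a_k)$
and to form the system

\[ g_1(x - a_1) - g_k(x - a_k) = 0, \quad k = 2, 3, \dots, n; \qquad \sum_{k=1}^{n} \gamma_k\, g_k(x) = G(x). \]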
This system is of the form (1.1). From a solution of this system we have a 
solution of equation (1.2); and vice versa. 
In a similar way one may reduce the problem of solving a system generalizing
(1.1) and (1.2) at the same time to the problem of solving a system
of the same form as (1.1).

Special cases of the problem here set have been treated by various authors.*

1.2. Introduction of symbolic operators. We define the symbolic operator
$E(a)$ by the relation

(1.3) E(a)-f(x) = f(x + a). 

A linear homogeneous combination of such operators will have the meaning 
indicated by the relation 


* See, for instance, C. Guichard, Annales de l'École Normale Supérieure, (3), vol. 4 (1887),
pp. 361-380; A. Hurwitz, Acta Mathematica, vol. 20 (1897), pp. 285-312; S. Pincherle, ibid., vol.
48 (1926), pp. 279-304 (first published in 1888); R. D. Carmichael, American Journal of
Mathematics, vol. 35 (1913), pp. 163-182; E. Hilb, Mathematische Annalen, vol. 85 (1922), pp. 89-98.



1933] SYSTEMS OF LINEAR DIFFERENCE EQUATIONS 


(1.4)   \[ \sum_{k=1}^{n} c_k E(a_k) \cdot f(x) = \sum_{k=1}^{n} c_k\, f(x + a_k). \]

This will serve, in particular, to define the sum and the difference of two
operators of the form $\alpha E(a)$ and $\beta E(b)$. The product of these two operators
is defined by the formula

(1.5)   \[ \alpha E(a) \cdot \beta E(b) = \alpha\beta\, E(a + b). \]


These definitions serve to give a unique meaning to any polynomial
combination of operators of the form $E(a_i)$, the coefficients being constants. Such
a polynomial in operators $E$ may be written as a linear function of suitably
defined operators $E$, as one sees by aid of (1.5). In particular, one may define
such an operator by means of a symbolic determinant of the form

(1.6)   \[ \Delta = \bigl|\, c_{\nu j} E(a_{\nu j}) \,\bigr| \]

whose element in the $\nu$th row and $j$th column is $c_{\nu j} E(a_{\nu j})$, this being (by definition)
the symbolic operator obtained by expanding the determinant formally as if
its elements were ordinary algebraic quantities. The expanded determinant
may be written as a linear homogeneous function of suitable operators $E$ with
constant coefficients.

Any polynomial combination of such operators $E$ will be said to have the
value zero when and only when the result of operating with it upon an
arbitrary integral function gives the value zero identically. It is easily shown that
such a polynomial in operators $E$ is zero if and only if the function $e^{xt}$ of $x$
is reduced to zero for all $t$ when operated upon by the named operator.
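The operator calculus of this section can be tried out numerically. The following is only a minimal sketch, assuming that a linear combination of operators E(a) is stored as a dictionary mapping each shift a to its coefficient; the helper names `op`, `add`, `mul`, `apply_op` and `is_zero` are illustrative and do not come from the paper.

```python
def op(*terms):
    """Build sum_k c_k E(a_k) from (c_k, a_k) pairs as a {shift: coefficient} dict."""
    out = {}
    for c, a in terms:
        out[a] = out.get(a, 0) + c
    # Terms whose coefficients cancel are dropped, so equal operators compare equal.
    return {a: c for a, c in out.items() if c != 0}

def add(p, q):
    """Sum of two operators: like shifts are merged."""
    return op(*[(c, a) for a, c in p.items()], *[(c, a) for a, c in q.items()])

def mul(p, q):
    """Product by rule (1.5): alpha E(a) . beta E(b) = alpha*beta E(a + b), extended bilinearly."""
    return op(*[(cp * cq, a + b) for a, cp in p.items() for b, cq in q.items()])

def apply_op(p, f):
    """Action by rule (1.4): the result is the function x -> sum_k c_k f(x + a_k)."""
    return lambda x: sum(c * f(x + a) for a, c in p.items())

def is_zero(p):
    """With distinct shifts merged, the operator is zero exactly when no term survives."""
    return len(p) == 0

# Example: (E(1) - E(0)) applied to x^2 is the forward difference 2x + 1.
forward_difference = op((1, 1), (-1, 0))
value_at_3 = apply_op(forward_difference, lambda x: x * x)(3)
```

Storing one coefficient per distinct shift mirrors the remark that every polynomial in the operators E reduces to a linear function of operators E.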

1.3. Symbolic form of (1.1); restriction on the system. Employing the
symbolic operators introduced in §1.2 we write system (1.1) in the form

(1.7)   \[ \sum_{j=1}^{n} c_{\nu j} E(a_{\nu j}) \cdot g_j(x) = \phi_\nu(x) \qquad (\nu = 1, 2, \dots, n). \]

The determinant $\Delta$ in (1.6) will be called the symbolic determinant of system
(1.7). This determinant will be called singular when it has the value zero;
otherwise it will be called non-singular.

We shall treat system (1.1) or (1.7) only in the case when its determinant
is non-singular. In that case we shall say that the system is non-singular. For
the treatment of the excluded exceptional case the methods required are
quite different from those here employed.

We shall use (without further definition) the terms customarily employed 
in the theory of determinants. 




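1.4. Separation of variables. Let $\Delta_{\nu j}$ denote the cofactor of the element
in the $\nu$th row and $j$th column of $\Delta$. Then $\Delta_{\nu j}$ is a polynomial in operators $E$
with constant coefficients. Moreover, we have

(1.8)   \[ \sum_{j=1}^{n} c_{\nu j} E(a_{\nu j})\, \Delta_{k j} = \delta_{\nu k}\, \Delta, \qquad \sum_{\nu=1}^{n} c_{\nu j} E(a_{\nu j})\, \Delta_{\nu k} = \delta_{j k}\, \Delta, \]

where $\delta_{\nu k}$ is 1 or 0 according as $\nu = k$ or $\nu \neq k$.
Multiplying the $\nu$th equation in (1.7) by the operator $\Delta_{\nu k}$, summing as to
$\nu$ from 1 to $n$, interchanging the order of summations in the first member of
the resulting equation and simplifying by aid of the second equation in (1.8),
we have

(1.9)   \[ \Delta\, g_k(x) = \sum_{\nu=1}^{n} \Delta_{\nu k}\, \phi_\nu(x) \qquad (k = 1, 2, \dots, n). \]

In these $n$ equations the unknown functions $g_k(x)$ appear singly. Any solution
of (1.7) must satisfy (1.9).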
The operator $\Delta$ may be written in the form

(1.10)   \[ \Delta = \sum_{k=0}^{\sigma} c_k E(a_k) \]

where the $c_k$ are constants different from zero and the $a_k$ are different
constants (this form being surely possible since $\Delta$ is non-singular). The value of
$\sigma$ depends on $n$ and the constants $c_{\nu j}$ and $a_{\nu j}$; it is never greater than $n! - 1$.

If $\sigma = 0$ the required inverse operator $\Delta^{-1}$ is $E(-a_0)/c_0$. In this case a
complete solution is readily obtained and the problem is trivial.

If $\sigma > 0$, as we shall henceforth assume it to be, we may write each
equation (1.9) in the form

(1.11)   \[ c_0\, g(x + a_0) + c_1\, g(x + a_1) + \cdots + c_\sigma\, g(x + a_\sigma) = \phi(x), \]

where $\phi(x)$ is a given function and $g(x)$ is to be determined. From the solution
of such an equation as this we shall pass to the solution of system (1.7).

We shall find it convenient to employ the function $h(t)$ defined by the
equation

(1.12)   \[ h(t) = \sum_{k=0}^{\sigma} c_k\, e^{a_k t}. \]

Since the given system is non-singular it follows that $h(t)$ is not identically
zero.
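The function h(t) is the multiplier by which the operator of (1.10) acts on an exponential: applying sum_k c_k E(a_k) to e^{xt}, regarded as a function of x, returns h(t) e^{xt}. The sketch below checks this numerically for one operator chosen purely for illustration; the helper names and the sample values of t and x are assumptions, not taken from the paper.

```python
import cmath

def h(terms, t):
    """h(t) = sum_k c_k * exp(a_k * t) for terms given as (c_k, a_k) pairs."""
    return sum(c * cmath.exp(a * t) for c, a in terms)

def apply_delta(terms, f, x):
    """(Delta f)(x) = sum_k c_k * f(x + a_k)."""
    return sum(c * f(x + a) for c, a in terms)

# Illustrative operator Delta = E(1) - E(2) and arbitrary sample points.
terms = [(1, 1), (-1, 2)]
t, x = 0.7 + 0.2j, 1.3

lhs = apply_delta(terms, lambda z: cmath.exp(z * t), x)  # Delta applied to e^{xt}
rhs = h(terms, t) * cmath.exp(x * t)                     # h(t) * e^{xt}
```

This is the reason a non-singular operator corresponds to a function h(t) that is not identically zero.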

We have seen that every solution of (1.7) is a solution of (1.9). But the
converse does not hold, as we shall now show. If $g_k(x)$, $k = 1, 2, \dots, n$, is
any solution of (1.9) then its general solution is $g_k(x) + p_k(x)$, $k = 1, 2, \dots, n$,



where the functions $p_k(x)$ are arbitrary functions satisfying the equation
$\Delta p_k(x) = 0$. Now consider the system

\[ g_1(x + 1) + g_2(x) + g_3(x + 2) = \phi_1(x), \quad g_1(x + 2) + g_2(x) + g_3(x + 1) = \phi_2(x), \quad g_3(x) = \phi_3(x). \]

Here we have $\Delta = E(1) - E(2) \neq 0$, whereas the function $g_3(x)$ is uniquely
determined by the last equation in the system.
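The symbolic determinant of this three-equation example can be expanded mechanically. The sketch below is illustrative only: it assumes each operator is stored as a dictionary from shift to coefficient and expands the determinant by the Leibniz (permutation) formula, multiplying entries by the rule alpha E(a) . beta E(b) = alpha*beta E(a + b). All function names are hypothetical.

```python
from itertools import permutations

def add(p, q):
    """Sum of two operators stored as {shift: coefficient}."""
    out = dict(p)
    for a, c in q.items():
        out[a] = out.get(a, 0) + c
    return {a: c for a, c in out.items() if c != 0}

def mul(p, q):
    """Product of two operators: shifts add, coefficients multiply."""
    out = {}
    for a, cp in p.items():
        for b, cq in q.items():
            out[a + b] = out.get(a + b, 0) + cp * cq
    return {a: c for a, c in out.items() if c != 0}

def perm_sign(perm):
    """Sign of a permutation, computed by counting inversions."""
    sign = 1
    for i in range(len(perm)):
        for j in range(i + 1, len(perm)):
            if perm[i] > perm[j]:
                sign = -sign
    return sign

def symbolic_det(matrix):
    """Formal expansion of the determinant whose entries are shift operators."""
    total = {}
    for perm in permutations(range(len(matrix))):
        term = {0: perm_sign(perm)}          # the sign times the identity operator E(0)
        for row, col in enumerate(perm):
            term = mul(term, matrix[row][col])
        total = add(total, term)
    return total

def E(a):
    return {a: 1}

ZERO = {}
system = [
    [E(1), E(0), E(2)],   # g1(x+1) + g2(x) + g3(x+2)
    [E(2), E(0), E(1)],   # g1(x+2) + g2(x) + g3(x+1)
    [ZERO, ZERO, E(0)],   # g3(x)
]
delta = symbolic_det(system)   # {1: 1, 2: -1}, that is, E(1) - E(2)
```

The result has two surviving terms, so here sigma = 1, well below the general bound n! - 1 = 5.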

From this example it follows that it is necessary to obtain an appropriate 
solution of (1.9) in order to have a solution of (1.7). The problem falls 
naturally into two cases; the following section prepares the way for this 
separation of cases. 

1.5. Functions of exponential type. If $f(x)$ is an analytic function which
is regular at $x_0$ and $x_1$ then it is easily shown that

\[ \limsup_{\nu \to \infty} \bigl| f^{(\nu)}(x_0) \bigr|^{1/\nu} = \limsup_{\nu \to \infty} \bigl| f^{(\nu)}(x_1) \bigr|^{1/\nu}, \]

where the superscripts denote derivatives with respect to $x$. If these superior
limits have the finite value $q$ ($q \geq 0$) then $f(x)$ is an integral function; in such
a case we shall say that $f(x)$ is of exponential type $q$, this terminology being
justified by the following theorem,* stated here without proof:


THEOREM 1.1. A necessary and sufficient condition that the function $f(x)$
shall be of exponential type $q$ is (1) that numbers $r$ shall exist for which it is true
that for every positive number $\epsilon$ there exists a quantity $M$, depending on $\epsilon$ and $r$
in general but independent of $x$, such that for all (finite) values of $x$, we have

\[ |f(x)| < M e^{(r + \epsilon)|x|} \]

and (2) that $q$ shall be the least possible value for such numbers $r$. Moreover, when
$f(x)$ is of exponential type $q$, we have

\[ \bigl| f^{(\nu)}(x) \bigr| < M (q + \epsilon)^{\nu} e^{(q + \epsilon)|x|} \qquad (\nu = 0, 1, 2, \dots), \]

where $M$ is independent of $x$ and $\nu$.
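As an illustration of the definition and of the theorem (the example is not part of the original text), one may take the function $f(x) = \cos qx$ with $q > 0$, for which both characterizations of the type can be checked directly:

```latex
% Worked example (illustrative): f(x) = cos(qx), q > 0, is of exponential type q.
\begin{align*}
  f^{(\nu)}(0) &=
    \begin{cases}
      (-1)^{\nu/2}\, q^{\nu}, & \nu \text{ even},\\
      0, & \nu \text{ odd},
    \end{cases}
  &&\text{so}\quad \limsup_{\nu\to\infty} \bigl|f^{(\nu)}(0)\bigr|^{1/\nu} = q; \\
  |\cos qx| &\le \cosh\bigl(q\,|\operatorname{Im} x|\bigr) \le e^{q|x|}
  &&\text{so condition (1) holds with } r = q; \\
  \cos(q \cdot iy) &= \cosh qy \ge \tfrac12\, e^{qy} \quad (y > 0)
  &&\text{so no } r < q \text{ satisfies condition (1)}.
\end{align*}
```

The least admissible $r$ is therefore $q$, in agreement with the value obtained from the derivatives at the origin.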


1.6. Case when the $\phi_\nu(x)$ are of exponential type. We first carry out the
solution of (1.7) for the case when the known functions $\phi_\nu(x)$ are of
exponential type not exceeding $q$. Taking their power series expansions in the
form

(1.13)   \[ \phi_\nu(x) = \sum_{k=0}^{\infty} s_{\nu k}\, \frac{x^k}{k!} \qquad (\nu = 1, 2, \dots, n), \]


* For a proof of this theorem and for further properties of functions of exponential type $q$,
together with references to the literature, see a forthcoming paper of mine in Annals of Mathematics.




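we introduce the functions $\psi_\nu(t)$ by means of the expansions

(1.14)   \[ \psi_\nu(t) = \frac{s_{\nu 0}}{t} + \frac{s_{\nu 1}}{t^2} + \frac{s_{\nu 2}}{t^3} + \cdots . \]

Then the series in (1.14) all converge if $|t| > q$. Let $r$ be a positive number
exceeding $q$ such that the circle $C_r$ of radius $r$ about 0 as a center passes
through no zero of the function $h(t)$ defined in (1.12). (This negative condition
on $r$ is first needed in the next paragraph.) Then we have

(1.15)   \[ \phi_\nu(x) = \frac{1}{2\pi i} \int_{C_r} e^{xt}\, \psi_\nu(t)\, dt \qquad (\nu = 1, 2, \dots, n), \]

as one sees by using expansions (1.14) and integrating term by term in (1.15).

We employ the operator $\Delta_{kj}$ with the meaning given in §1.4. By $\Delta_{kj} e^{xt}$ we
mean the result of operating with $\Delta_{kj}$ on $e^{xt}$ considered as a function of $x$.
Now write

(1.16)   \[ g_j(x) = \frac{1}{2\pi i} \int_{C_r} \sum_{k=1}^{n} \Delta_{kj} e^{xt} \cdot \frac{\psi_k(t)}{h(t)}\, dt \qquad (j = 1, 2, \dots, n), \]

this being suggested by the problem of solving (1.9) by the method employed
by Pincherle (loc. cit.) for a similar equation. Now substitute in the first
member of (1.7) the functions $g_j(x)$ so defined and simplify by aid of (1.8) and
(1.15); thus we have

\begin{align*}
\sum_{j=1}^{n} c_{\nu j} E(a_{\nu j})\, g_j(x)
  &= \frac{1}{2\pi i} \int_{C_r} \sum_{k=1}^{n} \sum_{j=1}^{n} c_{\nu j} E(a_{\nu j})\, \Delta_{kj} e^{xt} \cdot \frac{\psi_k(t)}{h(t)}\, dt \\
  &= \frac{1}{2\pi i} \int_{C_r} \sum_{k=1}^{n} \delta_{\nu k}\, \Delta e^{xt} \cdot \frac{\psi_k(t)}{h(t)}\, dt \\
  &= \frac{1}{2\pi i} \int_{C_r} e^{xt}\, \psi_\nu(t)\, dt = \phi_\nu(x).
\end{align*}

Therefore the functions $g_j(x)$ defined by (1.16) afford a solution of (1.7).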
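The contour-integral construction can be checked numerically on a single equation. The sketch below is illustrative and rests on assumptions that are not in the paper: it takes the equation 2 g(x) + g(x + 1) = e^x, so that h(t) = 2 + e^t, uses the function 1/(t - 1) as the transform of e^x (valid for |t| > 1), and takes the circle |t| = 2, which contains no zero of h(t) because those zeros lie at log 2 + (2k + 1) pi i. The integral is approximated by an equally spaced sum on the circle.

```python
import cmath
import math

def contour_solution(x, r=2.0, samples=400):
    """Approximate g(x) = (1/(2*pi*i)) * integral over |t| = r of e^{xt} psi(t)/h(t) dt."""
    total = 0j
    for k in range(samples):
        theta = 2 * math.pi * k / samples
        t = r * cmath.exp(1j * theta)
        psi = 1 / (t - 1)          # transform of phi(x) = e^x, valid for |t| > 1
        h = 2 + cmath.exp(t)       # h(t) for the equation 2 g(x) + g(x + 1) = phi(x)
        # On the circle dt = i t dtheta, so dt / (2 pi i) becomes t dtheta / (2 pi).
        total += cmath.exp(x * t) * psi / h * t
    return total / samples

x = 0.3
g_x = contour_solution(x)
residual = 2 * contour_solution(x) + contour_solution(x + 1) - math.exp(x)
```

In this example the only singularity inside the circle is the pole at t = 1, so the integral equals e^x / (2 + e), a solution of the same exponential type as the given function.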
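Equation (1.16) may be written in the form

\[ g_j(x) = \frac{1}{2\pi i} \int_{C_r} e^{xt}\, \frac{\sum_{k=1}^{n} \bigl( e^{-xt} \Delta_{kj} e^{xt} \bigr)\, \psi_k(t)}{h(t)}\, dt, \]

in which each factor $e^{-xt} \Delta_{kj} e^{xt}$ is independent of $x$.
Thence it follows that a constant $M$ exists such that

\[ \bigl| g_j^{(\nu)}(0) \bigr| < M r^{\nu} \qquad (\nu = 0, 1, 2, \dots). \]

Therefore $g_j(x)$ is an integral function of exponential type not exceeding $r$.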




Suppose next that $r$ is so chosen that there is no zero of $h(t)$ in the interior
of the circular ring bounded by $C_r$ and the circle $|t| = q$. Then $g_j(x)$ remains
unaltered as $r$ decreases towards $q$ remaining greater than $q$. Therefore, in
this case, the function $g_j(x)$ is of exponential type not greater than $q$.
Furthermore, if in this case at least one of the functions $\phi_\nu(x)$, $\nu = 1, 2, \dots, n$, is of
exponential type $q$ (none being of higher type), then at least one of the
functions $g_j(x)$, $j = 1, 2, \dots, n$, is of exponential type $q$ and none is of higher type.

If the functions $\phi_\nu(x)$ are of exponential type $q$ or less and at least one of
them is of type $q$ then a solution of (1.7) will be called a principal solution if
no function in it is of exponential type exceeding $q$. We have just shown the
existence of such principal solutions. To determine all principal solutions we
have to find all solutions, of exponential type not exceeding $q$, of the
homogeneous system corresponding to (1.7). This problem is left for a later
investigation.

The main result in this section may be stated as in the following theorem: 


THEOREM 1.2. When the $\phi_\nu(x)$ are functions of exponential type not greater
than $q$ and one at least of them is of type $q$, then the non-singular system (1.1)
or (1.7) admits as a principal solution the functions $g_j(x)$ defined by (1.16) for
$r = q + \epsilon$, where $\epsilon$ is a small positive quantity such that $h(t)$ has no zero in the
ring bounded by $C_r$ and the circle $|t| = q$.


1.7. Lemmas concerning exponential sums. Equations (1.9) are of the
form (1.11). Replacing $x$ in (1.11) by $x - a_0$ we have another equation of the
same form in which $a_0 = 0$. Hence there is no loss of generality in taking
$a_0 = 0$; and this we do. Then the function $h(t)$ in (1.12) has the form

(1.17)   \[ h(t) = c_0 + c_1 e^{a_1 t} + c_2 e^{a_2 t} + \cdots + c_\sigma e^{a_\sigma t}. \]

In preparation for the treatment of the case when the functions $\phi_\nu(x)$ are
general integral functions we state certain lemmas concerning the function
$h(t)$.

We shall first determine certain infinite regions in which $h(t)$ is free of
zeros and in fact is bounded away from zero. Separating $t$ and the $a_k$ into real
and imaginary parts we write

\[ t = u + iv, \qquad a_k = \alpha_k + i\beta_k. \]

Let $l_{\lambda\mu}$ denote the line

\[ R(a_\lambda t) = R(a_\mu t), \]

where $R(z)$ is the real part of $z$. Then $l_{\lambda\mu}$ and $l_{\mu\lambda}$ denote the same line. Moreover
$l_{\lambda\mu}$ and $l_{\rho\tau}$ coincide if $a_\lambda - a_\mu = c(a_\rho - a_\tau)$ where $c$ is a real number; otherwise
they do not coincide. Let $s$ denote the number of distinct lines in the set $l_{\lambda\mu}$.
Since each of these $s$ lines passes through zero they divide the plane into $2s$
sectors such that no point of any one of these lines is in the interior of any
such sector.
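Counting the distinct lines, and hence the 2s sectors, is easy to mechanize. The sketch below is illustrative and assumes the exponents are given as distinct Python complex numbers: two differences of exponents determine the same line through the origin exactly when their ratio is real, which is tested through the imaginary part of d * conj(e). The function name and the tolerance are hypothetical choices.

```python
def count_distinct_lines(a, tol=1e-12):
    """Count the distinct lines Re(a_l t) = Re(a_m t) determined by distinct exponents a."""
    directions = []
    for i in range(len(a)):
        for j in range(i + 1, len(a)):
            d = complex(a[i]) - complex(a[j])
            # d and e give the same line iff d / e is real, i.e. Im(d * conj(e)) = 0.
            if all(abs((d * e.conjugate()).imag) > tol for e in directions):
                directions.append(d)
    return len(directions)

s_real = count_distinct_lines([0, 1, 2])       # all differences real: a single line
s_mixed = count_distinct_lines([0, 1, 1j])     # differences 1, i, 1 - i: three lines
sectors = 2 * s_mixed
```

When all the exponents are real, as for ordinary difference equations with real spans, the star reduces to a strip about the imaginary axis and there are only two sectors.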

Let $S_1$ be any one of these $2s$ sectors and let $t_1$ be any given interior point
of $S_1$. Then no two of the quantities $R(a_k t_1)$, $k = 0, 1, \dots, \sigma$, are equal. Let
them be arranged in order of descending magnitude, thus:

\[ R(a_{k_0} t_1) > R(a_{k_1} t_1) > \cdots > R(a_{k_\sigma} t_1). \]

Then if $t_1$ varies continuously over the interior of $S_1$ this continued inequality
will be preserved, since each member varies continuously and no two become
equal for an interior point of $S_1$. One and just one term of this continued
inequality is zero for an interior point $t_1$ of $S_1$. Hence the first term is not
negative.

In the sector $S_1$ take a point $P$ which is at a distance $\delta$ from each of the
bounding rays of the sector, where $\delta$ is a positive quantity whose value is to be
assigned later. From $P$ draw rays to infinity in $S_1$ and parallel to the bounding
rays of $S_1$, thus forming a new sector $S$ interior to $S_1$.

Let $(u, v)$ be any point in $S$. Then the distance from $(u, v)$ to the line
$l_{k_0 k_\tau}$ is the positive quantity

\[ \frac{(\alpha_{k_0} - \alpha_{k_\tau})\, u - (\beta_{k_0} - \beta_{k_\tau})\, v}{\{ (\alpha_{k_0} - \alpha_{k_\tau})^2 + (\beta_{k_0} - \beta_{k_\tau})^2 \}^{1/2}}. \]

But this distance is not less than $\delta$, whether $l_{k_0 k_\tau}$ is or is not a bounding line
of $S_1$. Therefore if $t$ denotes the point $(u, v)$ we have

\[ R(a_{k_0} t) - R(a_{k_\tau} t) = (\alpha_{k_0} - \alpha_{k_\tau})\, u - (\beta_{k_0} - \beta_{k_\tau})\, v \geq \delta \{ (\alpha_{k_0} - \alpha_{k_\tau})^2 + (\beta_{k_0} - \beta_{k_\tau})^2 \}^{1/2}. \]

Now let $\eta$ be a fixed quantity such that

\[ |c_\tau| - e^{-\eta} \sum_{k \neq \tau} |c_k| > 0 \qquad (\tau = 0, 1, \dots, \sigma). \]

Let $m$ be the least value attained by the left member as $\tau$ varies over the set
$0, 1, \dots, \sigma$. Determine $\delta$ so that

\[ \delta \{ (\alpha_\lambda - \alpha_\mu)^2 + (\beta_\lambda - \beta_\mu)^2 \}^{1/2} \geq \eta > 0, \qquad \lambda \neq \mu, \]

for every pair of different numbers $\lambda$ and $\mu$ from the set $0, 1, \dots, \sigma$. Then
$R(a_{k_\tau} t) - R(a_{k_0} t) \leq -\eta$ for all $t$ in $S$ and $\tau \neq 0$. Hence $R(a_k t) \leq R(a_{k_0} t) - \eta$
for all $t$ in $S$ and for all $k$ in the set $0, 1, \dots, \sigma$ except $k = k_0$.
From these inequalities and the fact that $R(a_{k_0} t) \geq 0$ in $S$ it follows readily
that for all $t$ in $S$ we have

(1.18)   \[ |h(t)| \geq m > 0. \]




This inequality is independent of the particular sector $S$; hence it holds for
all sectors $S$ formed (in the way indicated) by aid of a $\delta$ satisfying the named
condition. We therefore have the following lemma:


LEMMA 1.1. In the sectors $S$ formed as indicated the function $h(t)$ satisfies
inequality (1.18).

When the sectors $S$ are cut out of the plane there is left a sort of infinite
star in which lie all the zeros of $h(t)$ and in fact all the points $t$ for which
$|h(t)| < m$. Thus we see that $h(t)$ is bounded away from zero in the distant
part of the plane except possibly for certain regions in the star-arms remaining
after removing the sectors $S$. We next consider the problem of bounding
$h(t)$ away from zero in certain parts of these star-arms.

By a rotation of the $t$-plane, obtained by replacing $t$ by $e^{i\theta} t$ where $\theta$ is
real, any particular bounding ray of any sector $S_1$ may be transformed to the
positive part of the $u$-axis. Since this transformation leaves invariant the sort
of result we are to establish we may (and we shall) temporarily suppose that
this transformation has already been carried out; for convenience we retain
the original notation. The star-arm to be considered will then lie along the
positive real axis; we denote it by $A$. Then the real axis is a line $l_{\lambda\mu}$ and we
have $\alpha_\lambda = \alpha_\mu$ while $\beta_\lambda \neq \beta_\mu$.

The maximum $\alpha_k$ is positive or zero, since $a_0 = 0$. If the maximum value
$\alpha$ of the $\alpha_k$ is the value of just one of them, then as $t$ becomes infinite in $A$, the
function $|h(t)|$ becomes infinite or approaches a finite limit different from
zero according as $\alpha$ is positive or zero. In this case $h(t)$ is bounded away from
zero in the distant part of the star-arm.

In what remains we may therefore suppose that the maximum value $\alpha$
of the $\alpha_k$ is the value of two or more of them. Now in the star-arm $A$ we have

\[ \bigl| e^{(\alpha + i\beta) t} \bigr| \geq \bigl| e^{i\beta t} \bigr|, \]

the sign of equality holding when and only when $\alpha = 0$. But $e^{i\beta t}$ and its
reciprocal are bounded in absolute value in $A$. It follows therefore that it is
sufficient to treat only the special case in which $\alpha = 0$, as may be seen by
replacing $h(t)$ by a suitable $e^{-(\alpha + i\beta) t} h(t)$. Therefore we take $\alpha = 0$. We
temporarily choose the notation so that the values of $k$ for which $\alpha_k = 0$ are
$k = 0, 1, \dots, \gamma - 1$.

Write

(1.19)   \[ h_1(t) = \sum_{k=0}^{\gamma - 1} c_k\, e^{i\beta_k t}. \]

Then $h(t) - h_1(t)$ approaches zero as $t$ becomes infinite in $A$. It is therefore
sufficient to our purpose to determine suitable parts of $A$ in which $h(t)$ is
different from zero and $h_1(t)$ is bounded away from zero.

For this investigation we need the following classical lemma which
we state without proof:


LEMMA 1.2. If $b_1, b_2, \dots, b_\nu$ is any set of real numbers, all different from
zero, and if $\delta$ is any preassigned positive number, then there is an infinitude of
positive integers $m$ such that, for each such $m$, integers $k_1, k_2, \dots, k_\nu$ exist such
that

(1.20)   \[ | k_j b_j + m | < \delta \qquad (j = 1, 2, \dots, \nu). \]

If all such positive integers $m$ are denoted by the symbols $m_1, m_2, \dots$, with
$m_i < m_{i+1}$, $i = 1, 2, \dots$, then among the differences $m_{i+1} - m_i$ there is a greatest
one.
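The integers m admitted by the lemma can be found by direct search. The sketch below is illustrative only and assumes the condition reads |k_j b_j + m| < delta for suitable integers k_j; for each m it picks the integer k_j nearest to -m / b_j, which minimizes that quantity. The function name and the sample data are hypothetical.

```python
import math

def admitted_integers(b, delta, limit):
    """Positive integers m <= limit such that every b_j has an integer k_j with |k_j*b_j + m| < delta."""
    admitted = []
    for m in range(1, limit + 1):
        ok = True
        for bj in b:
            k = round(-m / bj)               # integer k_j minimizing |k_j*b_j + m|
            if abs(k * bj + m) >= delta:
                ok = False
                break
        if ok:
            admitted.append(m)
    return admitted

even_only = admitted_integers([2.0], 0.1, 10)                       # [2, 4, 6, 8, 10]
two_numbers = admitted_integers([math.sqrt(2), math.sqrt(3)], 0.2, 50)
gaps = [n2 - n1 for n1, n2 in zip(two_numbers, two_numbers[1:])]
```

Inspecting `gaps` for a large `limit` gives a feel for the last assertion of the lemma, that the differences between consecutive admitted integers have a greatest value.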


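Applying this lemma to the case when $b_j = 1/\beta_j$ and $\nu = \gamma - 1$, we have

\[ | k_j + m\beta_j | < \delta\, |\beta_j| \qquad (j = 1, 2, \dots, \gamma - 1). \]

Thence it follows that for every preassigned positive $\epsilon$ there exists a $\delta$ such
that we now have

\[ | h_1(t + 2m\pi) - h_1(t) | = \Bigl| \sum_{k=0}^{\gamma - 1} c_k\, e^{i\beta_k t} \bigl( e^{2m\pi i \beta_k} - 1 \bigr) \Bigr| < \epsilon \]

for all $t$ in the star-arm $A$. Let $R$ be a rectangle two sides of which are on the
boundaries of $A$ and let it be subject to the condition that $h_1(t)$ does not
vanish in $R$. Let $\epsilon$ be such that $|h_1(t)| > 2\epsilon$ in $R$. Then

\[ | h_1(t + 2m\pi) | > \epsilon \]

when $t$ is in $R$ and $m$ is an integer admitted by the foregoing lemma.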

We now return to the original form of $h(t)$ as given in (1.17). On each arm
of the star associated with $h(t)$ we now take a rectangle $R$ obtained from the
foregoing one by reversing the rotation by which the corresponding arm is
put in the special position employed in the preceding argument; or, we take
any rectangle $R$ on the arm and in which $h(t)$ does not vanish, in case the
situation is such that the preceding argument reaches the goal before the
introduction of Lemma 1.2 and the rectangle $R$. Then $h(t)$ is bounded away
from 0 on $R$ and on all congruent rectangles (except a finite number at most)
similar to those in the preceding paragraph and containing the points $t + 2m\pi$
with $t$ on $R$ and $m$ determined as in the lemma or $m$ sufficiently large when
the lemma is not needed.




A part of the foregoing results may be stated in the following lemma:*


LEMMA 1.3. There exists a positive number $\epsilon$ such that $|h(t)| > \epsilon$ for all large
$t$ in sectors $S$ and for all large $t$ in rectangles $R$ or rectangles obtained from them
by the translations $t' = t + 2m\pi$ where $t$ is in $R$ and where $m$ is an integer
admitted by Lemma 1.2 for the star-arm in question or $m$ is any sufficiently large
integer in the cases where Lemma 1.2 is not employed in the argument.


For use in integrations later to be performed let us define a set of contours
$\Gamma_1, \Gamma_2, \Gamma_3, \dots$, passing through no zero of $h(t)$, such that 0 is interior to $\Gamma_1$
while $\Gamma_r$ is interior to $\Gamma_{r+1}$ and such that for $r$ greater than some preassigned
number the distance from 0 to a point of $\Gamma_r$ is not less than $r$ and not greater
than $r + \beta$ where $\beta$ is a sufficiently large given positive number, each contour
having the property that it consists of circular arcs (with 0 as center) in the
sectors $S$ and segments of the boundaries of the star-arms and straight line
segments crossing these arms in the rectangles $R$ or such rectangles congruent
to them as are admitted by Lemma 1.3 and the preceding discussion. Then
the length of $\Gamma_r$ bears a bounded ratio to $2\pi r$.

From Lemma 1.3 we then have the following: 


LEMMA 1.4. There exists a positive number $\epsilon$ such that $|h(t)| > \epsilon$ for every $t$
on every contour $\Gamma_1, \Gamma_2, \Gamma_3, \dots$.


From the distribution of the numbers $m_i$ as described in Lemma 1.2, it
follows that the contours $\Gamma_1, \Gamma_2, \dots$ may be further restricted so that the
number of contours crossing a given rectangle congruent to a given rectangle $R$
in accordance with Lemma 1.3 does not exceed a fixed bound.

1.8. Solution of equation (1.11). In equation (1.11) we take $a_0 = 0$, as we
may do without loss of generality. We now propose to show that, when
$\phi(x)$ is any given integral function, this equation has a solution $g(x)$ which is
itself an integral function.

We denote by $G_n(x)$ the polynomial which satisfies the equation

\[ c_0\, G_n(x) + \sum_{k=1}^{\sigma} c_k\, G_n(x + a_k) = x^n \]

and is (sometimes more precisely) defined by the formula

\[ G_n(x) = \frac{n!}{2\pi i} \int_C \frac{e^{xt}}{h(t)}\, \frac{dt}{t^{n+1}}, \]
* For such results as those in Lemmas 1.3 and 1.4 see the address of R. E. Langer, Bulletin of
the American Mathematical Society, vol. 37 (1931), pp. 213-239, and the papers there cited, especially
those of J. D. Tamarkin.




where $n$ is a positive integer or zero, $h(t)$ denotes the function defined in (1.17)
and $C$ is a contour inclosing the point 0 and no singularity of the integrand
other than $t = 0$.
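When h(0) is different from zero, the contour integral is n! times the coefficient of t^n in the power series of e^{xt}/h(t), so the polynomial can be computed by ordinary series division. The sketch below is illustrative and assumes rational coefficients c_k and shifts a_k together with h(0) != 0; the function names are hypothetical. Exact fractions are used so that the defining equation can be checked without rounding.

```python
from fractions import Fraction
from math import factorial

def g_polynomial(n, terms):
    """Coefficients of G_n(x), indexed by power of x, for h(t) = sum_k c_k exp(a_k t) with h(0) != 0."""
    # Taylor coefficients of h(t): h_m = sum_k c_k a_k^m / m!
    h = [sum(Fraction(c) * Fraction(a) ** m for c, a in terms) / factorial(m)
         for m in range(n + 1)]
    # Taylor coefficients d_m of 1/h(t), obtained from h(t) * (1/h(t)) = 1.
    d = [1 / h[0]]
    for m in range(1, n + 1):
        d.append(-sum(h[i] * d[m - i] for i in range(1, m + 1)) / h[0])
    # G_n(x) = n! * [t^n] e^{xt}/h(t) = sum_m n!/(n-m)! * d_m * x^(n-m)
    coeffs = [Fraction(0)] * (n + 1)
    for m in range(n + 1):
        coeffs[n - m] = Fraction(factorial(n), factorial(n - m)) * d[m]
    return coeffs

def evaluate(coeffs, x):
    """Value of the polynomial with the given coefficients at the rational point x."""
    return sum(c * Fraction(x) ** p for p, c in enumerate(coeffs))

# Illustrative equation G(x) + G(x + 1) = x^n, so h(t) = 1 + e^t and h(0) = 2.
terms = [(1, 0), (1, 1)]
g1 = g_polynomial(1, terms)     # x/2 - 1/4
g3 = g_polynomial(3, terms)
```

If h(0) = 0, as for the ordinary difference equation g(x + 1) - g(x) = x^n, the division above fails and the residue at t = 0 must be taken at a pole of higher order; this sketch does not cover that case.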
Let $\Gamma_r$, where $r$ is any positive integer, denote the contour represented by
this symbol in the latter part of §1.7. Form the function

\[ G_{nr}(x) = \frac{n!}{2\pi i} \int_{\Gamma_r} \frac{e^{xt}}{h(t)}\, \frac{dt}{t^{n+1}}. \]

This function satisfies the equation

\[ c_0\, G_{nr}(x) + \sum_{k=1}^{\sigma} c_k\, G_{nr}(x + a_k) = x^n. \]

Let $x$ be now confined to any preassigned finite region $T$ of the $x$-plane.
Then we have

\[ | G_{nr}(x) | \leq \frac{n!}{2\pi}\, M_1 \int_{\Gamma_r} \frac{| e^{xt} |}{|t|^{n+1}}\, |dt|, \]

where $M_1$ is a dominant of $|1/h(t)|$ for all $t$ on all contours $\Gamma_r$, the existence
of this dominant being assured by Lemma 1.4. From the character of the
contours $\Gamma_r$, as described in the latter part of §1.7, we now see that a
constant $M$ (independent of $x$ and $n$ and $r$) exists such that for all $x$ in $T$ we have

(1.21)   \[ | G_{nr}(x) | < M\, n!\, \rho^{r}\, r^{-n}, \]

where $\rho$ is such that $\rho > e^{|x|}$ for all $x$ in $T$.
Write the power series expansion of ¢(x) in the form 


(1.22) ¢(x) = 
Form the function g(x), 
(1.23) g(x) = ENG. 
Then the (v+1)th term of the series here written is, in the region T and for 
sufficiently large values of v, less in absolute value than the quantity 


M | r, | 


As $\nu$ becomes infinite the superior limit of the $\nu$th root of this quantity is zero
since $|\lambda_\nu|^{1/\nu}$ has the superior limit zero owing to the fact that $\phi(x)$ is an
integral function. Therefore the series in (1.23) converges absolutely and
uniformly in any whatever preassigned finite region T. Since each term of this
series is analytic throughout the finite plane it follows that g(x) is itself an
integral function.
It is readily verified by a direct substitution and a use of the named
properties of $G_{\nu r}(x)$ that this function g(x) satisfies equation (1.11) with $a_0 = 0$.
We are thus led to the following theorem:


THEOREM 1.3. If $\phi(x)$ denotes the integral function defined in (1.22) then
the series

(1.24) $\sum_{\nu=0}^{\infty} \lambda_\nu G_{\nu r}(x)$

is for suitable values of r absolutely and uniformly convergent in every finite
region of the complex plane (the value $r = \nu$ being always suitable) and defines a
sum function g(x) which is an integral function of x and satisfies equation (1.11)
with $a_0 = 0$.


If $\phi(x)$ is further restricted to be of exponential type q then it is easy to
show (compare §1.6) that r may be given a sufficiently large fixed value
(independent of $\nu$) in series (1.24) to insure convergence of the character indicated
in the theorem. In fact, $\Gamma_r$ may be replaced by the circle $C_r$ of §1.6. Then the
resulting solution g(x) of (1.11) is of exponential type not exceeding r where r
is the radius of the circle $C_r$. By taking r sufficiently small it may be brought
about that the resulting solution g(x) is of exponential type q; but there is
no solution g(x) of lower type than q. When $\phi(x)$ is of exponential type q
a solution g(x) of (1.11) of exponential type q may be called a principal
solution of that equation.
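The fixed-circle construction can be illustrated on a small example. This is a sketch under stated assumptions rather than the construction of §1.6, which is not reproduced here: it assumes the equation $g(x) + g(x+1) = \phi(x)$, so that $h(t) = 1 + e^t$, takes $\phi(x) = e^x$ (exponential type 1, coefficients $\lambda_\nu = 1/\nu!$), and uses a circle of radius 2, which exceeds the type of $\phi$ and incloses no zero of h. The function names and parameter values are illustrative.

```python
import cmath
import math

def h(t):
    """Hypothetical exponential sum h(t) = 1 + e^t, i.e. g(x) + g(x+1) = phi(x)."""
    return 1.0 + cmath.exp(t)

def principal_solution(x, lam, radius=2.0, samples=512):
    """Sum of lam[v] * G_v(x), every G_v being taken on the same circle |t| = radius.

    Interchanging sum and integral gives
        g(x) = (1/(2 pi)) * integral over theta of e^{xt}/h(t) * sum_v lam_v v! / t^v,
    which is approximated here by the trapezoidal rule.
    """
    total = 0j
    for m in range(samples):
        t = radius * cmath.exp(2j * math.pi * m / samples)
        weight = sum(l * math.factorial(v) / t ** v for v, l in enumerate(lam))
        total += cmath.exp(x * t) / h(t) * weight
    return total / samples

# phi(x) = e^x, truncated after 60 coefficients (an illustrative cut-off).
lam = [1.0 / math.factorial(v) for v in range(60)]
x = 0.3
residual = principal_solution(x, lam) + principal_solution(x + 1.0, lam) - math.exp(x)
print(abs(residual))
```

In this example the computed solution agrees with $e^{x}/(1+e)$, which is of exponential type 1, the same type as $\phi$; this is what the term principal solution describes.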

1.9. The general case of (1.1) when the $\phi_\nu(x)$ are integral functions. In
treating this case it is convenient to set forth first a particular solution of
system (1.9). Again and without loss of generality we take $a_0 = 0$.

Form the functions 


1 oo n zt d 
(1.25) g(t) =— OD 


— (k=1,2,---,n 
j=0 Tr h(t) ti+l ( 


where the coefficients $s_{\nu j}$ are those appearing in (1.13). Now the expression in
parenthesis under the integral sign is a function of t. If one utilizes the form
of this function of t then by means of an easy modification of the argument
employed in §1.8 one may show that the series in (1.25) converge absolutely
and uniformly in every preassigned finite region T of the x-plane and that
they define integral functions $g_k(x)$.

These integral functions may also be written in the form 




dt 


the series having the same properties of convergence as before indicated. 
Now by aid of (1.12) we have 


dt 
Agi(x) = > 


TL r; 


> — 


v=] j=0 r; 


An = 


j=0 


the last member being obtained from (1.13). Hence the functions $g_k(x)$ in
(1.26) afford a solution of (1.9) with $a_0 = 0$.

That these same functions also afford a solution of (1.7) will next be
proved. For this purpose substitute these functions $g_k(x)$ in the first member
of (1.7) after replacing $\nu$ by $\mu$. Simplifying the result by aid of equation (1.8)
and other named formulas we have


d. 


t+1h(2) 


j=0 r; vol 


1 
2 
e*dt 
r, 


fun pil 


Ae**dt 


Hence system (1.7) is satisfied by these functions $g_k(x)$.
Thus we have the following theorem:


THEOREM 1.4. When the functions $\phi_\nu(x)$ in the non-singular system (1.7) are
given integral functions and when the constant $a_0$ in (1.10) has the value 0 the
system has a solution $g_k(x)$, k = 1, 2, ..., n, consisting of integral functions
defined by equations (1.26), and the series in these equations converge absolutely
and uniformly in every preassigned finite region T of the x-plane.


From this theorem it follows that every non-singular system (1.1) has a
solution consisting of integral functions whenever the given functions $\phi_\nu(x)$
are themselves integral. The more special case in which the $\phi_\nu(x)$ are of
exponential type has already been treated in §1.6.




II. SIMULTANEOUS EXPANSIONS OF INTEGRAL FUNCTIONS IN 
COMPOSITE POWER SERIES 


2.1. Formulation of the problem. For n > 1 we consider the question of
expanding integral functions $f_1(x), f_2(x), \dots, f_n(x)$ simultaneously in
composite power series, that is, we consider the problem of representing these
functions in the form

(2.1) $f_\nu(x) = \sum_{j=1}^{n} \sum_{k=0}^{\infty} c_{jk}\,(x - a_{\nu j})^{k} \quad (\nu = 1, 2, \dots, n),$

where the coefficients $c_{jk}$ are to be independent of both x and $\nu$. We impose the
further condition on the coefficients $c_{jk}$ that they shall be such that the series
in the equations

(2.2) $g_j(x) = \sum_{k=0}^{\infty} c_{jk}\,x^{k} \quad (j = 1, 2, \dots, n)$


shall converge for all finite values of x; then the sum functions $g_j(x)$ defined
by them will be integral functions. These conditions on the $c_{jk}$ are equivalent
to the conditions that the quantities $|c_{jk}|^{1/k}$, j = 1, 2, ..., n, shall all have
the limit zero as k becomes infinite. Furthermore we subject the given
constants $a_{\nu j}$ to the condition that the determinant $\Delta(t)$ whose element in the $\nu$th
row and jth column is $\exp(-a_{\nu j}t)$ shall not be identically zero as a function of
t. In the exceptional or singular case in which this condition on $\Delta(t)$ is not
satisfied the general investigation will require methods different from those
here employed; and the results will lack the simplicity and elegance which
belong to the general case here treated.

For n = 1 the problem evidently reduces to the classical problem of
expansions in power series. We suppose throughout that n > 1.

Under the conditions named we shall show that such simultaneous
expansions always exist and indeed that they always exist subject to the further
condition that the functions $g_j(x)$, j = 1, 2, ..., n, shall be of exponential
type provided in the latter case that the functions $f_\nu(x)$, $\nu$ = 1, 2, ..., n, are
of exponential type.

If we employ the notation defined in (2.2) we may write (2.1) in the form

(2.3) $f_\nu(x) = \sum_{j=1}^{n} g_j(x - a_{\nu j}) \quad (\nu = 1, 2, \dots, n).$

Integral solutions of this system evidently lead through (2.2) to the required
expansions (2.1). The condition put on $\Delta(t)$ is just that which is required to
make the results of the first part of this paper applicable to system (2.3) and
hence to the expansion problem here set.




2.2. Expansions in the case of general integral functions $f_\nu(x)$. From
Theorem 1.4 and the remark following it one concludes that system (2.3) has in
this case integral solutions $g_j(x)$, j = 1, 2, ..., n. Therefore we have the
following theorem:


THEOREM 2.1. If $f_1(x), f_2(x), \dots, f_n(x)$ are any given integral functions and
if the constants $a_{\nu j}$ are such that the determinant $\Delta(t)$ has the property described
in the first paragraph of §2.1, then these functions $f_\nu(x)$ have simultaneous
expansions of the form (2.1) where


$\lim_{k=\infty} |c_{jk}|^{1/k} = 0 \quad (j = 1, 2, \dots, n).$


Formulas in §1.9 afford an effective means of obtaining suitable
coefficients $c_{jk}$ to be employed in the expansions (2.1). Only in exceptional cases is
it true that these expansions are unique. The determination of the extent of
arbitrary elements involved in the coefficients of the expansions depends on
the (as yet undeveloped) theory of system (2.3) for the case when $f_\nu(x) = 0$,
$\nu$ = 1, 2, ..., n.

2.3. Expansions when the $f_\nu(x)$ are of exponential type. Applying
Theorem 1.2 to system (2.3) in the case when the functions $f_\nu(x)$ are of exponential
type and interpreting the results in terms of the expansions in (2.1), we have
the following theorem:


THEOREM 2.2. If the functions $f_1(x), f_2(x), \dots, f_n(x)$ are of exponential
type not exceeding q, one at least of them being precisely of type q, and if the
constants $a_{\nu j}$ are such that the determinant $\Delta(t)$ has the property described in the
first paragraph of §2.1, then the functions $f_\nu(x)$ have simultaneous expansions of
the form (2.1) such that the associated functions $g_j(x)$ of (2.2) are of exponential
type and indeed such that these functions $g_j(x)$ are of exponential type not
exceeding q, one at least of them being precisely of type q.


When the associated $g_j(x)$ are of exponential type not exceeding q we
shall say that the series in (2.1) afford principal expansions of the functions
$f_\nu(x)$.

Even with the strongest conditions imposed on the coefficients $c_{jk}$ by the
latter part of the foregoing theorem it is still true that the expansions (2.1)
need not be unique. In all cases belonging to this section possible values of the
coefficients $c_{jk}$ are readily determined from the special case of equation
(1.16) applicable here, as we show in the next paragraph; and these values
may well vary in dependence upon the radius r of the circle $C_r$ appearing in
(1.16).


In connection with the expansions 
= 
form the functions 
Let $\Delta_{\nu j}(t)$ be the cofactor of the element in the $\nu$th row and jth column of
$\Delta(t)$. Then by aid of (1.16) it may readily be shown that suitable coefficients
$c_{jk}$ in (2.1) are the following:


1 thdt 


Cik AO ’ 
where j = 1, 2, ..., n and k = 0, 1, 2, ... .

2.4. The case $a_{\nu j} = a_{jj}$ for $\nu > j$. In this case system (2.3) is equivalent to
the system consisting of the first equation in (2.3) and the following n - 1
equations:


(2.4) $f_\nu(x) - f_{\nu-1}(x) = \sum_{j=\nu}^{n} \{\, g_j(x - a_{\nu j}) - g_j(x - a_{\nu-1,j}) \,\} \quad (\nu = 2, 3, \dots, n).$

In case $a_{n-1,n} = a_{nn}$ it is clear that we must have $f_{n-1}(x) = f_n(x)$ as a necessary
condition for satisfying the system. In fact, it is easy to see that the functions
$f_\nu(x)$ must satisfy one or more special restrictive conditions if one or more of
the relations

(2.5) $a_{\nu-1,\nu} - a_{\nu\nu} \neq 0 \quad (\nu = 2, 3, \dots, n)$

fails to be satisfied. But if conditions (2.5) are all satisfied then we have an
instance of the general theory already developed; we shall suppose that these
conditions are satisfied. We assume that the given functions $f_1(x), \dots, f_n(x)$
are all integral functions. We require that the functions $g_1(x), \dots, g_n(x)$
shall be integral functions.

Taking $\nu = n$ in (2.4) we see that $g_n(x)$ is uniquely determined as an
integral function except for an arbitrary additive periodic integral function of
period $a_{n-1,n} - a_{nn}$. Taking $g_n(x)$ to be any integral function satisfying (2.4)
for $\nu = n$ we may then determine $g_{n-1}(x)$ uniquely except for an additive
integral function of period $a_{n-2,n-1} - a_{n-1,n-1}$. With $g_{n-1}(x)$ determined we
proceed similarly to the determination of $g_{n-2}(x)$, and we continue thus until
$g_2(x)$ is determined. Then the first equation in (2.3) uniquely determines
$g_1(x)$. It appears, therefore, that in the present case one can determine
completely the arbitrary elements in the solution of (2.3) subject to the named
conditions. Hence all possible expansions (2.1) are completely determined for
the present case.

If we further restrict the given functions $f_1(x), \dots, f_n(x)$ to be of
exponential type not greater than q we may likewise determine the functions
$g_1(x), \dots, g_n(x)$ so that they are of exponential type not greater than q and
we may show precisely what is arbitrary in the determination of such
functions subject to these conditions. These results may then be carried over to
the corresponding case of the expansions (2.1).

There is one case of particular interest in which the expansions (2.1),
when subject to the condition named in the preceding paragraph, are unique
except for the trivial restriction that the constants $c_{j0}$, j = 1, 2, ..., n, are
not separately determined but only their sum is determined. This is the case
in which the functions $f_1(x), \dots, f_n(x)$ are of exponential type not greater
than q while at the same time the relations


(2.6) $q\,|a_{\nu-1,\nu} - a_{\nu\nu}| < 2\pi \quad (\nu = 2, 3, \dots, n)$


are all satisfied. For in this case each $g_j(x)$ is uniquely determined except for
an additive constant. These conditions are obviously satisfied whenever
inequalities (2.5) hold provided that q = 0 and in particular provided that
the functions $f_\nu(x)$ are polynomials.

2.5. The case n = 2. For the case n = 2 system (2.3) may be written in the
form


(2.7) $f_1(x + a_{11}) = g_1(x) + g_2(x + a_{11} - a_{12}),$

$f_2(x + a_{21}) = g_1(x) + g_2(x + a_{21} - a_{22}).$


The exceptional case here is that in which $a_{11} - a_{12} = a_{21} - a_{22}$. When this
condition is satisfied, the system can have a solution only when
$f_1(x + a_{11}) = f_2(x + a_{21})$, as one sees from (2.7); and in this case it is clear that
either of the integral functions $g_1(x)$ and $g_2(x)$ may be assigned at will and that the
other is then uniquely determined: the case is therefore trivial.

When $a_{11} - a_{12} \neq a_{21} - a_{22}$ the case belongs to that treated in §2.4.

As an application of the case when $a_{11} = a$, $a_{12} = b$, $a_{21} = 0 = a_{22}$, where
$a \neq b$, we see that an arbitrary integral function f(x) may be expanded in the form


(2.8) $f(x) = \sum_{k=0}^{\infty} \{\, c_k (x - a)^{k} + \gamma_k (x - b)^{k} \,\},$


where the sums $c_k + \gamma_k$, k = 0, 1, 2, ..., have any preassigned values subject
to the condition that

$\lim_{k=\infty} |c_k + \gamma_k|^{1/k}$

shall exist and be equal to zero; and the parts of f(x) represented by the
component power series in x - a and x - b respectively, when these parts are
themselves required to be integral functions, are unique except for an arbitrary
integral periodic function of period a - b to be added to one part and
subtracted from the other.

Furthermore, if f(x) is of exponential type not greater than q and if the
parts of f(x) represented by the component power series in x - a and x - b
respectively are required to be of exponential type not greater than q, then
there exists an expansion of the form (2.8) subject to the condition that


$\limsup_{k=\infty} |(c_k + \gamma_k)\,k!|^{1/k} \le q;$


and the expansion is unique except for an arbitrary periodic function of period
a - b and of exponential type not greater than q, such periodic function to be
added to one component part of f(x) and subtracted from the other. If we add
the further restriction that $q\,|a - b| < 2\pi$ then this periodic function reduces
to a constant, so that the expansion (2.8) is then essentially unique.
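The ambiguity described in this section can be seen concretely. The sketch below is illustrative only: the component functions `g1` and `g2` passed in are hypothetical, the periodic function is taken to be $\sin(2\pi y/(a-b))$ (period a - b, exponential type $2\pi/|a-b|$), and real a and b are assumed for simplicity. It checks that adding this function to one component and subtracting it from the other changes neither $f(x) = g_1(x-a) + g_2(x-b)$ nor the sum $g_1(x) + g_2(x)$, and it encodes the inequality $q\,|a-b| < 2\pi$ under which no such non-constant periodic function of type not exceeding q exists.

```python
import math

def periodic_term(y, a, b):
    """A periodic function of period a - b and exponential type 2*pi/|a - b|."""
    return math.sin(2.0 * math.pi * y / (a - b))

def max_discrepancy(a, b, g1, g2, points):
    """Largest change in f(x) = g1(x-a) + g2(x-b) and in g1(x) + g2(x) when the
    periodic term is added to g1 and subtracted from g2."""
    worst = 0.0
    for x in points:
        f_old = g1(x - a) + g2(x - b)
        f_new = (g1(x - a) + periodic_term(x - a, a, b)) + (
            g2(x - b) - periodic_term(x - b, a, b))
        sum_old = g1(x) + g2(x)
        sum_new = (g1(x) + periodic_term(x, a, b)) + (g2(x) - periodic_term(x, a, b))
        worst = max(worst, abs(f_old - f_new), abs(sum_old - sum_new))
    return worst

def essentially_unique(q, a, b):
    """The condition q * |a - b| < 2*pi stated in the text."""
    return q * abs(a - b) < 2.0 * math.pi

# Hypothetical components and points, chosen only for illustration.
points = [k / 10.0 for k in range(-30, 31)]
print(max_discrepancy(1.5, 0.25, lambda y: y * y + 1.0, lambda y: 3.0 * y - 2.0, points))
print(essentially_unique(1.0, 1.5, 0.25), essentially_unique(6.0, 1.5, 0.25))
```

The first printed value should be at rounding level, since the two periodic terms differ by a full period in their arguments.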

2.6. Generalizations. From the fact established in §1.9 that the
non-singular system (1.1) always has integral solutions when the $\phi_\nu(x)$ are given
integral functions it follows that any set $\phi_1(x), \dots, \phi_n(x)$ of integral
functions has simultaneous expansions in the form


(2.9) $\phi_\nu(x) = \sum_{k=0}^{\infty} \sum_{j=1}^{n} c_{\nu j}\,\alpha_{kj}\,(x + a_{\nu j})^{k} \quad (\nu = 1, 2, \dots, n),$


where the constants $\alpha_{kj}$ are independent of x and $\nu$ and where the
component functions $g_j(x)$,


$g_j(x) = \sum_{k=0}^{\infty} \alpha_{kj}\,x^{k},$


are themselves integral functions. If the $\phi_\nu(x)$ are subject to the further
condition that they shall be of exponential type not greater than q then the
expansions (2.9) exist subject (as one sees from §1.6) to the condition that
the component functions $g_j(x)$ shall also be of exponential type not greater
than q. If furthermore at least one of the functions $\phi_\nu(x)$ is of precisely type
q then one at least of the component functions $g_j(x)$ is of precisely type q.

These results are capable of extension by means of the generalizations
indicated near the end of §1.1.

There is a special case arising from expansions (2.9) to which particular
attention may be directed. Let $a_{1j} = -a_j$, j = 1, 2, ..., n, where $a_1, a_2, \dots, a_n$
are different constants, and let the other $a_{\nu j}$ have the value 0. Let $c_{1j} = 1$,
j = 1, 2, ..., n, while the other $c_{\nu j}$ are such that the matrix


$$\begin{pmatrix} c_{21} & c_{22} & \cdots & c_{2n} \\ \vdots & \vdots & & \vdots \\ c_{n1} & c_{n2} & \cdots & c_{nn} \end{pmatrix}$$


is of rank n - 1. Then the corresponding system (1.1) is non-singular. Consider
the problem of expanding a given integral function $\phi(x)$ in the form


(2.10) $\phi(x) = \sum_{k=0}^{\infty} \sum_{j=1}^{n} \alpha_{kj}\,(x - a_j)^{k}.$

Since $\phi(x)$ thus takes the place of $\phi_1(x)$ in (2.9) and since the remaining
integral functions


$\phi_2(x), \dots, \phi_n(x)$


in (2.9) may be assigned at will, it follows that an expansion of the form
(2.10) exists (not necessarily unique) such that

$\lim_{k=\infty} |\alpha_{kj}|^{1/k} = 0 \quad (j = 1, 2, \dots, n),$

while the quantities


$\beta_{\nu k} = \sum_{j=1}^{n} c_{\nu j}\,\alpha_{kj} \quad (\nu = 2, \dots, n;\ k = 0, 1, 2, \dots)$


may be assigned at will subject to the condition that


$\lim_{k=\infty} |\beta_{\nu k}|^{1/k} = 0 \quad (\nu = 2, \dots, n).$


This result affords an interesting generalization of the Cauchy-Taylor
expansion of an integral function. Whether there exists a corresponding
generalization for functions analytic in a finite region I have not sought to
determine.

If $\phi(x)$ is further restricted to be of exponential type not greater than q
then there exists an expansion of the form (2.10) (not necessarily unique)
such that


$\limsup_{k=\infty} |\alpha_{kj}\,k!|^{1/k} \le q \quad (j = 1, 2, \dots, n),$


while the quantities

$\beta_{\nu k} \quad (\nu = 2, \dots, n;\ k = 0, 1, 2, \dots)$


may be assigned at will subject to the condition that


$\limsup_{k=\infty} |\beta_{\nu k}\,k!|^{1/k} \le q \quad (\nu = 2, \dots, n).$


III. EXPANSIONS IN SERIES OF EXPONENTIAL FUNCTIONS 
3.1. Properties of exponential sums. Let us denote by h(t) the function

(3.1) $h(t) = c_1 e^{a_1 t} + c_2 e^{a_2 t} + \cdots + c_n e^{a_n t}, \quad n > 1,$


where $a_1, a_2, \dots, a_n$ are different constants and $c_1, c_2, \dots, c_n$ are constants
different from zero. And let us consider the problem of bounding away from
zero the function $e^{-xt}h(t)$ for suitable given values of x and for suitable ranges
of t. The results are needed for our later investigation (§3.2) of certain contour
integrals.

Let P be the smallest convex polygon, in the complex plane, containing
the points $a_1, a_2, \dots, a_n$; this polygon may in special cases reduce to a
straight line segment. Let Q be the polygon* obtained by reflecting P through
the real axis. For the sake of definiteness we suppose that the notation is so
chosen that the vertices of P, taken in counter-clockwise order, are
$a_1, a_2, \dots, a_\nu$ ($\nu \le n$) and that no $a_j$ has its real part less than that of $a_1$.
Moreover we suppose that the vertices are so taken that no three of these a's at the
vertices lie on the same straight line. Let $l_1, l_2, \dots, l_\nu$ be the rays normal to
the sides of Q at their centers and drawn outward from this polygon; when Q
reduces to a straight line it is to be understood that these rays are two in
number and that they are drawn so that there is one in each direction from
the middle point of the line. We take the notation so that $l_1, l_2, \dots, l_\nu$ are in
clockwise order and so that $l_1$ is the normal to the side joining the conjugates
of $a_1$ and $a_2$.

Let $a_i$ and $a_j$ be two consecutive vertices of P and let $l_i$ be the normal to
that side of Q which joins the corresponding vertices of Q. If R(z) denotes the
real part of z, then the line $R(a_i t) = R(a_j t)$ is parallel to the line $l_i$. Let $\rho$ be a
positive number whose value is later to be conveniently restricted. On each
side of each line $l_1, l_2, \dots, l_\nu$ and at a distance $\rho$ from it draw a ray in such a
way that these rays will make a sort of infinite star similar to that considered
in §1.7 and containing the rays $l_1, l_2, \dots, l_\nu$ in the centers of its arms. These
rays form certain sectors S, similar to those in §1.7 and containing no
interior points of the named infinite star.

In order to have sectors exactly like those in §1.7 it is necessary to divide
some of the sectors S into smaller sectors by excluding other strips; but this
further division is to serve only a temporary purpose in the argument. It may
be described as follows. Let $m_1, m_2, \dots, m_\nu$ be rays from zero to infinity
parallel to $l_1, l_2, \dots, l_\nu$ respectively but such that $m_k$ goes to infinity in a
direction opposite to that of $l_k$. Some rays $m_i$ may go to infinity in the same
direction as other rays $l_j$ (and they will do so when Q has pairs of parallel
sides); remove such rays $m_i$; if any rays $m_k$ remain after this removal, denote
them by $m_\alpha, m_\beta, \dots$. Along the rays $m_\alpha, m_\beta, \dots$ remove strips of width $2\rho$
as in the case of the preceding paragraph. Then some sectors S are separated
into two or more sectors (together with one or more strips). After all such
separations are made, let S' be a symbol to denote the totality of sectors
obtained, including undivided sectors S and the parts into which some sectors
S have been separated.


* Such polygons as P and Q have been employed by Pólya, Mathematische Annalen, vol. 89
(1923), pp. 179-191.

From Lemma 1.1 it follows that $\rho$ may be taken sufficiently large that
h(t), and hence $e^{-xt}h(t)$, shall have no zero in any sector S'. Moreover, from
the same lemma it follows that $\rho$ may be taken sufficiently large (and we so
take it) that $e^{-xt}h(t)$ is bounded away from zero in the sectors S' when x is any
one of the points $a_1, a_2, \dots, a_n$. In fact, when x has any such value the
function $e^{-xt}h(t)$ is a function meeting the conditions on h(t) in §1.7 so that
Lemmas 1.3 and 1.4 are also applicable to $e^{-xt}h(t)$ for such values of x.

Let $a_i$, $a_j$ and $a_k$ be any three consecutive vertices of P in
counter-clockwise order and let $l_i$ and $l_j$ be the rays perpendicular to the sides of Q
with corresponding vertices. Let $S_{ij}$ be the sector S lying between $l_i$ and $l_j$.
Suppose that t varies in $S_{ij}$. Let x be a fixed point in P. We have

$e^{-xt}h(t) = e^{(a_j - x)(t - \bar{a}_j)} \cdot e^{(a_j - x)\bar{a}_j} \cdot e^{-a_j t}h(t),$

where $\bar{a}_j$ is the conjugate of $a_j$. The last factor in the second member is
bounded away from zero for large t in the named sector, as we have already
seen. The middle factor is a constant different from zero, since x is fixed. The
argument of the exponent of the first factor lies between $-\tfrac{1}{2}\pi$ and $\tfrac{1}{2}\pi$
inclusive, as one may readily show graphically, if (as we do by taking $\rho$
sufficiently large) we restrict the sector $S_{ij}$ to lie in the sector formed by rays
from $\bar{a}_j$ to infinity in the direction of the rays $l_i$ and $l_j$: in establishing the
named fact it is convenient temporarily to transform the points of the plane
by adding $-\bar{a}_j$ to each value in it so that the representation of $\bar{a}_j$ becomes the
point zero and then to begin from the plots of $\bar{x} - \bar{a}_j$ and $t - \bar{a}_j$. Thence it
follows that $e^{-xt}h(t)$ is bounded away from zero in the named sector.
Furthermore it follows from Lemma 1.3 that $e^{-xt}h(t)$ is bounded away from zero for
all large t in rectangles congruent to the rectangles R in the way specified in
that lemma, these rectangles R being chosen with reference to the function
$e^{-xt}h(t)$.

Let us now further restrict x to lie in the interior of P. Then there exists
a positive number $\epsilon$ such that


$-\tfrac{1}{2}\pi + \epsilon \le \arg\{(a_j - x)(t - \bar{a}_j)\} \le \tfrac{1}{2}\pi - \epsilon.$


Thence it follows that the function $t^{-1}e^{-xt}h(t)$ is bounded away from zero as t
becomes infinite in the named sector.

The same function is also bounded away from zero if x is on the boundary
of P but not at $a_j$ while t becomes infinite in the named sector in such a way
as to remain outside of each of two parabolas with vertex at $\bar{a}_j$ and having the
named rays from $\bar{a}_j$ parallel to $l_i$ and $l_j$ as their principal diameters.

It may now be observed that every strip along one of the rays $m_\alpha, m_\beta, \dots$
lies (except for a finite part of it) entirely in a sector S and that it has a
direction intermediate to the directions of the bounding rays of this sector S.
Thence it follows also that such a strip (except for a finite part of it) lies
entirely outside of the parabolas along the bounding rays of this sector S.
Hence the strips along the rays $m_\alpha, m_\beta, \dots$ may be removed and we thus
return to the set of sectors S as defined in this section; and for the plane so
divided we have the requisite character of $e^{-xt}h(t)$ or $t^{-1}e^{-xt}h(t)$ as a function
bounded away from zero, in accordance with the paragraph next following.

Summing up these results we may state that $e^{-xt}h(t)$ is bounded away
from zero for any given x in P and for all large t in all sectors S formed with
sufficiently large $\rho$ and in all rectangles congruent to rectangles R in
accordance with Lemma 1.3; that $t^{-1}e^{-xt}h(t)$ is bounded away from zero for each
interior point x of P and for all large t in all such sectors S; and that $t^{-1}e^{-xt}h(t)$
is bounded away from zero for each x on the boundary of P and not at a
vertex of P and for all large t in all such sectors S and outside of all parabolas of
the sort described for $\bar{a}_j$ in the previous paragraph, two such parabolas being
formed at each vertex of Q.

3.2. Properties of certain contour integrals. Let $C_1, C_2, \dots, C_s, \dots$ be
a set of different contours in the complex plane such that any given point on
$C_s$ is either interior to $C_{s+1}$ or on $C_{s+1}$ and such that for every s there exists an
r such that the contour $C_s$ is a contour $\Gamma_r$ of the sort described in §1.7 and
suitable to apply to $e^{-xt}h(t)$ for points x in P as the contours $\Gamma_r$ apply to the
function h(t) of §1.7 and such that for every r there is an s such that $C_s$ is a
contour $\Gamma_r$.

Let $\psi(t)$ be any function of t which is analytic at infinity and vanishes
there and let us write


(3.2) $\psi(t) = \frac{\gamma_1}{t} + \frac{\gamma_2}{t^{2}} + \frac{\gamma_3}{t^{3}} + \cdots.$

Let r be a fixed integer such that the contour $C_r$ lies entirely within the region
of convergence of the series in (3.2). Form the function $F_r(x)$,


(3.3) $F_r(x) = \frac{1}{2\pi i}\int_{C_r} \frac{e^{xt}}{h(t)}\,\psi(t)\,dt.$


Then $F_r(x)$ is a function of exponential type; and, in fact, it is such a function
as arises from the solution of equation (1.11) when $\phi(x)$ is a given function of
exponential type, as one sees from Part I and especially from §1.6.

Let p be any positive integer and form the function $F_{r+p}(x)$ by changing
r to r + p in (3.3). We shall show that

(3.4) $\lim_{p=\infty} F_{r+p}(x) = 0$


when any one of the following conditions is satisfied:


(1) when x is in the interior of P;
(2) when x is on the boundary of P and is not a vertex of P;
(3) when x is a vertex of P provided in this case that $\gamma_1 = 0$.


It is convenient to carry out the proof first for the case when $\gamma_1 = 0$. Then
a number M exists such that $|t^{2}\psi(t)| < M$ on all the contours $C_{r+p}$. We let x
be any point of P either in the interior or anywhere on the boundary. Then
from the results at the end of §3.1 it follows that a constant $M_1$ exists such
that $|e^{xt}\{h(t)\}^{-1}| < M_1$. Hence there is a constant $M_2$ such that in this case
we have


$|F_{r+p}(x)| < M_2 \int_{C_{r+p}} \frac{|dt|}{|t|^{2}}.$

This implies the truth of (3.4) when $\gamma_1 = 0$ and x is anywhere in P.

With this result in hand we see that (3.4) will be established in the three
cases (1), (2), (3) if we further prove its validity in cases (1) and (2) for the
particular function $\psi(t) = 1/t$, since we may then pass to the general case in
an obvious manner.

In case (1) let us write

(3.5) $F_{r+p}(x) = \frac{1}{2\pi i}\int_{S_{r+p}} \frac{e^{xt}}{t\,h(t)}\,dt + \frac{1}{2\pi i}\int_{A_{r+p}} \frac{e^{xt}}{t\,h(t)}\,dt,$

where $S_{r+p}$ denotes the set of paths consisting of the parts of $C_{r+p}$ which lie
in the sectors S while $A_{r+p}$ is the set of paths consisting of the remaining parts
of $C_{r+p}$. Then on $A_{r+p}$ the integrand has a dominant of the form $M/|t|$ while
on $S_{r+p}$ it has a dominant of the form $M/|t^{2}|$, as one sees from the results in
the last paragraph of §3.1. Thence we conclude readily to the truth of (3.4)
for the present case, since the total length of the parts $A_{r+p}$ is bounded.

In case (2) we may use notationally the same equation (3.5) where we now
understand that $A_{r+p}$ denotes the set of paths consisting of the parts of $C_{r+p}$
which lie in the parabolas described near the end of §3.1 while $S_{r+p}$ consists of
the remaining parts of $C_{r+p}$. The conclusion that (3.4) is valid in the present
case is reached in the same way as in the preceding paragraph but by using
the additional fact that the total length of the parts $A_{r+p}$ bears to the
minimum distance d from zero to points of $A_{r+p}$ a ratio which is infinitesimal as
r + p becomes infinite.

Thus the relation (3.4) is established for all points x of P except that when
x is at a vertex of P we require that $\gamma_1$ shall have the value zero.

3.3. Expansions in series of exponential functions. Let $S_{r+p}(x)$ denote
the negative of the sum of the residues of the function $e^{xt}\{h(t)\}^{-1}\psi(t)$ in the
region bounded by the contours $C_{r+p-1}$ and $C_{r+p}$. If the function has no
singularity in this region we shall understand that $S_{r+p}(x)$ is identically zero. In
all other cases $S_{r+p}(x)$ is a function of the form $ce^{\alpha x}$ or a sum of a finite number
of such functions. We have


$F_r(x) - F_{r+p}(x) = \sum_{k=1}^{p} S_{r+k}(x).$

If we suppose that x is a point of P and in case $\gamma_1 \neq 0$ that it is not a vertex
of P then relation (3.4) is applicable to the foregoing equation when p is
allowed to become infinite and we have the following theorem:


THEOREM 3.1. The function $F_r(x)$ defined in (3.3) has the expansion


(3.6) $F_r(x) = \sum_{k=1}^{\infty} S_{r+k}(x)$

in series of exponential functions, valid for all values of x in the polygon P,
except that the vertices are to be excluded when $\gamma_1 \neq 0$.


In the special case when $h(t) = e^{t} - 1$ the series in (3.6) is a Fourier series.
The polygon P in this case reduces to the interval (0, 1) of the real axis, the
end points of the interval serving as the vertices of the polygon. A further
treatment of Fourier series from this point of view will appear in a
forthcoming paper in Annals of Mathematics.
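A worked instance of this special case may help. The sketch below assumes $h(t) = e^{t} - 1$ together with the illustrative choice $\psi(t) = 1/t$ (so $\gamma_1 = 1$) and a small circle about 0 as the initial contour. The residue of $e^{xt}/\{t(e^{t}-1)\}$ at t = 0 is then $x - \tfrac{1}{2}$, and the poles $t = \pm 2\pi i k$ contribute, after the change of sign and pairing of conjugate terms, $-\sin(2\pi k x)/(\pi k)$. The function name is hypothetical.

```python
import math

def fourier_partial_sum(x, terms):
    """Partial sum of -sum_{k>=1} sin(2*pi*k*x) / (pi*k), the residue series
    obtained for h(t) = e^t - 1 and psi(t) = 1/t (illustrative choices)."""
    return -sum(math.sin(2.0 * math.pi * k * x) / (math.pi * k)
                for k in range(1, terms + 1))

x = 0.3
print(fourier_partial_sum(x, 5000), x - 0.5)   # interior point: the two agree closely
print(fourier_partial_sum(0.0, 5000), 0.0 - 0.5)  # vertex x = 0: they differ
```

At the end point x = 0 every term of the series vanishes while $x - \tfrac{1}{2} = -\tfrac{1}{2}$, which is consistent with the exclusion of the vertices when $\gamma_1 \neq 0$.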

The foregoing theorem serves to expand in series (3.6) any whatever
function that may be put in the form (3.3). If $h(0) \neq 0$ it is evident that any
given polynomial in x may be put in the form $F_r(x)$ by taking $C_r$ to be a small
circle about 0 as a center and by choosing $\psi(t)$ properly as a polynomial in
1/t. The function $F_r(x)$ + constant may also in other cases represent any
whatever polynomial in x. Hence, in particular, all polynomials have
expansions in the form (3.6), or in this form with an additive constant, valid in
polygons P as indicated.

3.4. A special class of the foregoing expansions. We shall now examine the
special case of the foregoing expansion theory in which the function
$e^{xt}\{h(t)\}^{-1}$ has the form


(3.7) $\frac{e^{xt}}{(e^{\rho_1 t} - 1)(e^{\rho_2 t} - 1)\cdots(e^{\rho_n t} - 1)},$


where $\rho_1, \rho_2, \dots, \rho_n$ are real or complex constants different from 0 and such
that neither the sum nor the difference of two of them is zero.

It is convenient, for the sake of simplicity, to normalize the problem by
means of certain elementary transformations. If $\rho_j$ has a negative real part
we may replace $\rho_j$ by $-\rho_j$ by multiplying both numerator and denominator
in (3.7) by $-e^{-\rho_j t}$ and so obtain (except for an irrelevant change in sign) a
similar expression with x replaced by $x - \rho_j$; by a translation in the x-plane
we may then replace $x - \rho_j$ by x. We suppose all such translations made so
that we shall assume that the real part of each $\rho_j$ is positive or zero. Then the
further conditions on $\rho_1, \rho_2, \dots, \rho_n$ are that they are different from each
other and from zero. Then the point zero is on the boundary of the polygon P,
introduced (§3.1) in the general case, and the greatest real value of a point in
P is the sum of the real parts of $\rho_1, \rho_2, \dots, \rho_n$. We suppose that the notation
is so chosen that


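(3.8) $-\tfrac{1}{2}\pi \le \arg\rho_1 \le \arg\rho_2 \le \cdots \le \arg\rho_n \le \tfrac{1}{2}\pi.$

By means of a straight line join each point (except the last) in the set

(3.9) $0,\ \rho_1,\ \rho_1 + \rho_2,\ \dots,\ \rho_1 + \rho_2 + \cdots + \rho_n,\ \rho_2 + \cdots + \rho_n,\ \dots,\ \rho_n,\ 0$

to the one which follows it, thus forming a convex polygon of an even number
of sides and having its sides parallel in pairs. This is the polygon P, as one
sees by examining points x in the sectors formed by adjacent sides. Then the
points x in P are the points


(3.10) $x = \lambda_1\rho_1 + \lambda_2\rho_2 + \cdots + \lambda_n\rho_n \quad (0 \le \lambda_k \le 1;\ k = 1, 2, \dots, n),$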

as one sees by aid of the fact that each of these points lies in the strips each
of which is bounded by two parallel sides of P and by showing that every
point in P is a point x of the named form. The boundary of P is traced out in
counter-clockwise order by starting with all $\lambda$'s equal to zero, then letting
$\lambda_1$ increase from 0 to 1, then $\lambda_2$ from 0 to 1, and so on to $\lambda_n$, letting it increase
from 0 to 1, then letting $\lambda_1$ decrease from 1 to 0, $\lambda_2$ from 1 to 0, and so on till
$\lambda_n$ decreases from 1 to 0.
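The description of the boundary translates directly into a construction of the vertices. The following sketch assumes the normalization of the text (real parts not negative, arguments ordered as in (3.8)); the function names and the sample constants are hypothetical. It walks the boundary by adding $\rho_1, \dots, \rho_n$ in order of increasing argument and then subtracting them in the same order, and checks convexity through the sign of the cross product of consecutive sides.

```python
import cmath

def polygon_vertices(rhos):
    """Vertices of P in counter-clockwise order, starting at 0.

    The sides are rho_1, ..., rho_n (sorted by argument) followed by
    -rho_1, ..., -rho_n, which mirrors letting each lambda_k rise from 0 to 1
    and then fall from 1 to 0.
    """
    ordered = sorted(rhos, key=cmath.phase)
    sides = ordered + [-r for r in ordered]
    vertices = [0j]
    for side in sides[:-1]:  # the last side returns to the starting point 0
        vertices.append(vertices[-1] + side)
    return vertices

def is_convex_ccw(vertices, tol=1e-12):
    """True when consecutive sides always turn to the left (or go straight)."""
    n = len(vertices)
    for i in range(n):
        u = vertices[(i + 1) % n] - vertices[i]
        v = vertices[(i + 2) % n] - vertices[(i + 1) % n]
        if (u.conjugate() * v).imag < -tol:  # cross product of u and v
            return False
    return True

# Hypothetical constants satisfying the stated normalization.
rhos = [1 + 0j, 1j, 1 + 1j]
verts = polygon_vertices(rhos)
print(verts)
print(is_convex_ccw(verts), max(v.real for v in verts), sum(r.real for r in rhos))
```

For these constants the polygon has 2n = 6 sides, and the greatest real part among its vertices equals the sum of the real parts of the constants, as stated in the text.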
For every point x in P the function (3.7) may be written in the form


(3.11) $\frac{e^{xt}}{(e^{\rho_1 t} - 1)\cdots(e^{\rho_n t} - 1)} = \frac{e^{\lambda_1\rho_1 t}}{e^{\rho_1 t} - 1}\cdots\frac{e^{\lambda_n\rho_n t}}{e^{\rho_n t} - 1},$


where $0 \le \lambda_k \le 1$, k = 1, 2, ..., n. For this special case the inequalities
obtained in §3.1 may be derived in a very simple manner, as one may see by
applying the methods of §§1.7 and 3.1 separately to each factor of the
second member of (3.11) and simplifying the procedure in obvious ways for
these special cases.

Moreover, when no two of the $\rho_k$ have a real ratio, the contours $C_1, C_2, \dots$
may be chosen so that $C_k$ incloses just k zeros of h(t) for k = 1, 2, ... . Hence
the terms $S_{r+k}(x)$ in (3.6) may all be taken in the form $ce^{\alpha x}$ so that we have to
do with expansions of the form*


m=1 kel 
In what follows in this section we shall suppose that no two of the numbers
$p_k$ have a real ratio. Then no two terms in the series (3.12) involve the same
exponential function.
With each of the functions 


(3.13) 1, (k = 1,2,---,;m = 1,2,3,---) 


let us associate its reciprocal and let us call this associated function the ad- 
joint of the given function. If we multiply any whatever function of the set 
(3.13) by the adjoint of any other function in the set, we have a product of the 
form 


n 


k=l 


where at least one and not more than two of the integers $l_k$ are different from
zero. There is a side of the polygon P on which $x/p_k$ ranges from 0 to 1; on
that side we denote $x/p_k$ by $\lambda_k$. Then


1 1 1 n 
(3.14) f f f ( = 0. 
0 0 0 k=1 


If a like integral is formed with a function of the set (3.13) and the adjoint 
of that function then the integral corresponding to (3.14) has the value 1. 
Hence we have conditions of biorthogonality generalizing those pertaining 
to the case of Fourier series, here arising when m = 1. Consequently we have a 
formal method of determining the coefficients in series (3.12) for a much more 
extensive class of functions than those for which we have already established 
the validity of such expansions. This suggests the generalization of the whole 


* Series similar to those in (3.12) have been treated by P. Bohl, Magisterdissertation, Dorpat, 
1893, and Journal fiir Mathematik, vol. 131 (1906), pp. 286-321. 




theory of Fourier series to the particular class of series in (3.12) if not indeed 
to the more general class of §3.3; but we shall not now pursue these general- 
izations. 

Other generalizations of the whole theory developed in this part of the 
paper will readily occur to the reader, including among others such exten- 
sions of the Birkhoff expansion theory as are parallel to the foregoing exten- 
sion of the theory of Fourier series and also the extensions of these theories 
to the expansions of functions of several variables; but these also we leave 
to a future investigation. 

3.5. Applications to Bernoulli polynomials. The theory in §3.4 affords
elegant expansions of Bernoulli polynomials of higher order, namely, the
polynomials $B_\nu^{(n)}$ defined by the identity

(3.15)  $\frac{p_1 p_2 \cdots p_n\, t^n e^{xt}}{(e^{p_1 t} - 1) \cdots (e^{p_n t} - 1)} = \sum_{\nu=0}^{\infty} \frac{t^\nu}{\nu!}\, B_\nu^{(n)}(x \mid p_1, \ldots, p_n).$

From this identity we have

(3.16)  $B_\nu^{(n)}(x \mid p_1, \ldots, p_n) = \frac{\nu!\, p_1 p_2 \cdots p_n}{2\pi i} \int_C \frac{e^{xt}\, dt}{(e^{p_1 t} - 1) \cdots (e^{p_n t} - 1)\, t^{\nu - n + 1}},$

where C denotes a small circle about the point zero. Our theory is effective
for values of $\nu$ not less than n.
Thus we have in particular the expansion 


1 { etkrz erkriz 


! 


Qn 
(3.17) 


= j tke 1 


e7tkriz 
+ (- (v = 3,4,5,---). 

According to the general theory this series must converge for those
values x, x = u + iv, for which u and v run independently over the closed
interval (0, 1). By considering separately the four cases u < 0, u > 1, v < 0, v > 1,
it is easily shown that the series diverges in each case through having terms
in the brackets become infinite in an exponential way as k becomes infinite.
Hence the whole region of convergence of the series is the square whose
vertices are 0, 1, 1 + i, i.

From this example it follows that the polygon P of convergence in the 
case of the general theory can not be extended to a larger region in which the 
series always converges. 
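The simplest instance of such an expansion is the classical Fourier case n = 1, $p_1 = 1$, where the polygon P reduces to the segment from 0 to 1. The sketch below is not the expansion (3.17); it only checks numerically the well-known series $B_2(x) = \pi^{-2} \sum_{k \ge 1} \cos(2\pi k x)/k^2$ for the ordinary Bernoulli polynomial $B_2(x) = x^2 - x + \tfrac16$ on that segment. The function names and the truncation length are assumptions made for illustration.

```python
import math

def bernoulli2(x):
    """The ordinary Bernoulli polynomial B_2(x) = x^2 - x + 1/6."""
    return x * x - x + 1.0 / 6.0

def bernoulli2_series(x, terms=5000):
    """Truncated classical Fourier series of B_2 on the segment 0 <= x <= 1."""
    total = sum(math.cos(2.0 * math.pi * k * x) / (k * k)
                for k in range(1, terms + 1))
    return total / math.pi ** 2
```

On the segment the truncated series agrees with the polynomial to within the size of the neglected tail. Outside the segment the series merely repeats periodically and no longer represents the polynomial, which is the one-dimensional counterpart of the remark that P cannot be enlarged.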


UNIVERSITY OF ILLINOIS, 
URBANA, ILL.


CONVERGENCE CRITERIA FOR DOUBLE FOURIER 
SERIES* 


BY 
J. J. GERGEN†


1.1. Introduction. We shall consider here analogues for double Fourier 
series of certain convergence criteria for simple Fourier series. The tests for 
simple series in question are the familiar tests of Dini, Jordan, de la Vallée-
Poussin, Lebesgue, Young, and Hardy and Littlewood, and the tests obtained
by various authors in generalizing the Young and the Lebesgue test. All
these criteria are stated, the logical relations between them discussed, and
references to them given in the author’s paper 4.‡ Rather than duplicate
this material here we refer the reader to that paper. Analogues for some of 
these tests have appeared in the literature. Our first purpose here is to es- 
tablish analogues of those remaining. Our second purpose is to discuss the 
logical relations between the tests for double series. We obtain, incidentally, 
an extension of Tonelli’s convergence criterion for double series which deals 
with functions of bounded variation. Statements of our results and a general 
summary of the convergence theory are to be found in §§1.2 to 1.6, the proofs
of our theorems, in §§2.0 to 13.1. We do not always attempt to model the 
proof of a generalization after the proof of the original; but deduce first a 
test of the Lebesgue type, and from it the other tests. We thus obtain at the 
same time information as to the relations between the tests. 

1.2. We suppose once and for all that the double Fourier series in question
is that of an even-even function f(u, v) which is integrable in the Lebesgue
sense over the square Q(0, 0; π, π) and is doubly periodic with period 2π
in each variable. Further, we shall confine our attention to the behavior of
the Fourier series of f at the origin. We have, then,

$f(u, v) \sim \sum_{m,n=0}^{\infty} \lambda_{m,n} a_{m,n} \cos mu \cos nv,$


where 


* Presented to the Society, March 25, 1932; received by the editors May 25, 1932, and, in re- 
vised form, August 22, 1932. 

† A number of the results contained in this paper were obtained while the author was a National
Research Fellow. The problem of obtaining a generalization to double series of the Lebesgue
test for simple series was suggested to the author by Professor Hardy; and the author wishes to
thank him for this and other suggestions. The author also wishes to thank Dr. Agnew for reading the
manuscript of this paper and suggesting several corrections and improvements.

‡ Numbers in bold face type refer to the Bibliography at the end of this paper.




J. J. GERGEN [January 


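$\lambda_{0,0} = \tfrac14$, $\lambda_{m,0} = \lambda_{0,n} = \tfrac12$, and $\lambda_{m,n} = 1$ for 0 < m, 0 < n,

and

$a_{m,n} = \frac{4}{\pi^2} \iint_Q f(u, v) \cos mu \cos nv \, du\, dv.$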


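The series for f at the origin is

(1.21)  $\sum_{m,n=0}^{\infty} \lambda_{m,n} a_{m,n};$

the partial sum $s_{m,n}$ of order m, n of this series is

$s_{m,n} = \sum_{i=0}^{m} \sum_{j=0}^{n} \lambda_{i,j} a_{i,j} = \frac{1}{\pi^2} \iint_Q f(u, v)\, \frac{\sin (m + \frac12) u}{\sin \frac12 u}\, \frac{\sin (n + \frac12) v}{\sin \frac12 v}\, du\, dv;$

and we are concerned with the limit

$\lim_{m,n \to \infty} s_{m,n}$

taken in the Pringsheim sense.* Any test for the convergence of the series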
(1.21) yields, of course, a test for convergence of the Fourier series of an 
arbitrary integrable function at an arbitrary point. 
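The closed form of the partial sum rests on the identity $\sum_{i=0}^{m} \sum_{j=0}^{n} \lambda_{i,j} \cos iu \cos jv = \sin (m+\tfrac12)u \, \sin (n+\tfrac12)v \,/\, (4 \sin \tfrac12 u \sin \tfrac12 v)$, where $\lambda_{0,0} = \tfrac14$, $\lambda_{i,0} = \lambda_{0,j} = \tfrac12$ and $\lambda_{i,j} = 1$ otherwise. A minimal numerical check of this identity may be sketched as follows; the function names are hypothetical and serve only the illustration.

```python
import math

def lam(i, j):
    """Weights of the even-even double cosine series."""
    if i == 0 and j == 0:
        return 0.25
    if i == 0 or j == 0:
        return 0.5
    return 1.0

def kernel_sum(m, n, u, v):
    """Left-hand side: the finite double cosine sum."""
    return sum(lam(i, j) * math.cos(i * u) * math.cos(j * v)
               for i in range(m + 1) for j in range(n + 1))

def kernel_closed(m, n, u, v):
    """Right-hand side: product of two Dirichlet kernels (u, v not multiples of 2*pi)."""
    return (math.sin((m + 0.5) * u) * math.sin((n + 0.5) * v)
            / (4.0 * math.sin(0.5 * u) * math.sin(0.5 * v)))
```

Multiplying the identity by $(4/\pi^2) f(u, v)$ and integrating over Q gives the integral expression for $s_{m,n}$.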
1.3. To simplify the writing we employ a form of the Landau limit notation.
Given two functions h(x, y) and ψ(x, y), defined for all sufficiently small
positive values of x and y, we write

(1.31)  $h(x, y) = o\{\psi(x, y)\}$

if, corresponding to each number 0 < ε, we can choose 0 < $\delta_\epsilon$ so that

(1.32)  $|h(x, y)| \le \epsilon\, |\psi(x, y)|$

for 0 < x < $\delta_\epsilon$, 0 < y < $\delta_\epsilon$. We write

(1.33)  $h(x, y) = O\{\psi(x, y)\}$

if (1.32) holds for some ε and all sufficiently small positive values of x and y.
Given two functions h(x, y; k) and ψ(x, y; k), defined for each large value of k
for sufficiently small positive values of x and y, we write


(1.34) h(x, y; k) = y; k)} 


* Pringsheim, 12, p. 103. The series (1.21) converges, to sum s, or
$\lim_{m,n \to \infty} s_{m,n} = s$
in the Pringsheim sense, if there corresponds to every number 0 < ε an integer N such that, if N ≤ m, N ≤ n,
then $|s_{m,n} - s| \le \epsilon$.
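The Pringsheim condition can be probed, though of course not proved, on a finite window of indices. The sketch below assumes a hypothetical double sequence $s_{m,n} = 1/(m+1) + 1/(n+1)$, which tends to 0 in the Pringsheim sense, and checks the inequality $|s_{m,n} - s| \le \epsilon$ for N ≤ m, N ≤ n within the window; the names and the window size are assumptions.

```python
def within_eps_beyond(s, limit, eps, N, window=50):
    """Check |s(m, n) - limit| <= eps for all N <= m, n < N + window.

    A finite check of this kind can refute a proposed N but cannot
    establish convergence, which concerns all m, n >= N.
    """
    return all(abs(s(m, n) - limit) <= eps
               for m in range(N, N + window)
               for n in range(N, N + window))

def example_sequence(m, n):
    """Hypothetical double sequence converging to 0 in the Pringsheim sense."""
    return 1.0 / (m + 1) + 1.0 / (n + 1)
```

For ε = 0.01 the choice N = 200 suffices for this sequence, whereas N = 100 does not, since both indices must be large simultaneously.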


and 
§ 


1933] DOUBLE FOURIER SERIES 31 


if, corresponding to each 0 < ε, we can choose, first, 0 < $k_\epsilon$; and then, 0 < $\delta_{k,\epsilon}$,
so that (1.32) holds for


We write 
(1.36) h(x, y; k) = Ofw(x, y; k)} 


if, corresponding to some ε, we can choose $k_\epsilon$ and $\delta_{k,\epsilon}$ as above so that (1.32)
holds for all x, y, and k satisfying (1.35).

1.4. The known tests for the convergence of the series (1.21) which are of 
interest here may now be recalled. In stating these, and in what follows, we 
understand that letters capped by bars, ($\bar D$), ($\bar J$), etc., have the same meanings
as the same letters without the bars in the author’s paper 4. Letters
without bars refer to conditions and tests for double series. In some of the 
tests there are two or three conditions. We shall always regard the set of 
conditions in any test as a single condition and denote it by the same notation 
as we use to denote the test itself. Similarly, when a test involves but one 
condition we denote it in the same way as we do the test itself. 

The conditions sufficient for the convergence of the series (1.21) are in 
(Dy) Young’s analogue of Dini’s test (D):t 


d 
0 0 


where s is a constant and &, & are functions such that &(u)/u, &(v)/v are in- 
tegrable over (0, 7); 


{ In case y is a non-vanishing function, then (1.31), (1.33), (1.34), (1.36) respectively holds if, 
and only if, 
limit  (h/y) =0, fimit 


(z,y) (+ 0,+0) 


k- (x,y) (+ 0,+0) 

t Young, 15, p. 182. About the same time as Young’s paper appeared Kiistermann, 10, p. 28, 
published an analogue of Dini’s test. Later a test of this type was given by Merriman, 3, p. 129. 
The test (Dy) is not exactly the test of any of these authors. It includes them all as particular cases 
however. 

Young’s condition is 
(Dy*) the function (f—s)/(uv) is integrable over Q. 

Young proves that, if (Dy*) is satisfied, then so also is (V y*) (see the footnote on (V y) above). Using 
Young’s method it is not difficult to show that (Dy) implies(V y*). Thus (Dy) is a sufficient condition 
for convergence. 




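($J_H$) Hardy’s analogue of Jordan’s test ($\bar J$):†
($J_H'$) f is finitely defined everywhere in Q and

$\int_0^\pi \int_0^\pi |d_{u,v} f(u, v)| < \infty,$

where the integral represents the total variation of f over Q,‡ and

($J_H''$)  $\int_0^\pi |d_u f(u, 0)| < \infty, \quad \int_0^\pi |d_v f(0, v)| < \infty,$

where the first integral is the total variation of f(u, 0), and the second, the total
variation of f(0, v), over (0, π).§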

($J_T$) Tonelli’s analogue of ($\bar J$):||

($J_T'$) f is finitely defined everywhere in Q and

$V_1(v) = \int_0^\pi |d_u f(u, v)| < V(v), \quad V_2(u) = \int_0^\pi |d_v f(u, v)| < V(u),$

where V is integrable over (0, π), the first for every v, and the second for every u,
on (0, π),¶ and


W(x, y) = lim fil d,f(u, y) | o(1), 


(Jr'’) 
Wx, y) = lim f | def(x, ») | = o(1).** 


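† Hardy, 5, p. 65.
‡ The total variation of f over the rectangle $(a_1, b_1; a_2, b_2)$, that is, $a_1 \le u \le a_2$, $b_1 \le v \le b_2$,
is defined as the upper bound of all sums of the type

$\sum_{i=1}^{m} \sum_{j=1}^{n} |f(u_i, v_j) - f(u_i, v_{j-1}) - f(u_{i-1}, v_j) + f(u_{i-1}, v_{j-1})|,$

where $u_i$, i = 0, 1, ..., m, and $v_j$, j = 0, 1, ..., n, are any numbers such that
$a_1 = u_0 < u_1 < \cdots < u_m = a_2$, $b_1 = v_0 < v_1 < \cdots < v_n = b_2$.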
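One of the sums entering this definition can be computed directly for a given partition. The following is a minimal sketch, assuming f is a Python function of two real arguments and the partitions are given as increasing lists; the name `variation_sum` is hypothetical. A single partition only bounds the total variation from below.

```python
def variation_sum(f, us, vs):
    """Sum of absolute double differences of f over the grid us x vs.

    us, vs: increasing lists u_0 < ... < u_m and v_0 < ... < v_n.
    The total variation is the upper bound of this quantity over all grids.
    """
    total = 0.0
    for i in range(1, len(us)):
        for j in range(1, len(vs)):
            total += abs(f(us[i], vs[j]) - f(us[i], vs[j - 1])
                         - f(us[i - 1], vs[j]) + f(us[i - 1], vs[j - 1]))
    return total
```

For f(u, v) = uv on the unit square every such sum equals 1, while for a function of the form g(u) + h(v) every double difference vanishes. This is why a condition of this type must be supplemented by separate conditions on f(u, 0) and f(0, v).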

§ Hardy states this second condition as
($J_H''$*)  $\int_0^\pi |d_u f(u, v)| < \infty, \quad \int_0^\pi |d_v f(u, v)| < \infty,$
the first for every v, the second for every u, on (0, π).

It is pointed out by Young, 15, p. 142, that, if ($J_H'$) and ($J_H''$) hold, then ($J_H''$*) holds.

|| Tonelli, 13, p. 455, and 14. 

¶ This condition is stated by Tonelli as
($J_T'$*) $V_1$, $V_2$ are integrable over (0, π).

Since, as Professor Adams pointed out to the author, there are functions for which $V_1$ and $V_2$
are not measurable, there is some gain in generality in taking the condition as we do. That ($J_T'$)
and ($J_T''$) are sufficient conditions for convergence, we prove by Theorems I, II and III below.

When a function satisfies (J) it may be said to be of bounded variation in the Hardy sense, and 
when it satisfies (J7 ), of bounded variation in the Tonelli sense. Other definitions of bounded vari- 
ation have been given by various authors. For a complete discussion of these, see Adams, 1. For a 
convergence theorem involving another definition, see Hobson, 2, p. 705. For further references on 
convergence criteria, see Tonelli, 14. 

** Tonelli states this condition in a slightly different but equivalent way. 




($V_Y$) Young’s analogue of de la Vallée-Poussin’s test ($\bar V$):† the mean value

(1.41)  $F(u, v) = \frac{1}{uv} \int_0^u d\sigma \int_0^v f(\sigma, t)\, dt$

of f satisfies ($J_H$).

In ($D_Y$) the sum of the series is s; in ($J_H$) and ($J_T$), f(+0, +0); and in ($V_Y$),
F(+0, +0).

Each of the above tests is plainly analogous to the corresponding test for 
simple series. There is, however, one aspect in which these tests and, in fact, 
all the tests given here for double series, differ from the original tests. In each 
test for double series there is some condition on f, other than integrability, 
over the whole square Q, whereas for simple series, the only conditions im- 
posed, other than integrability, are neighborhood conditions. Now, by the 
analogue of the Riemann-Lebesgue theorem,‡ the behavior of f in the square
(δ, δ; π, π), provided 0 < δ, has no effect on the convergence of the series
(1.21). Thus this difference could be partially eliminated; but we cannot, as 
might be expected, confine our conditions to neighborhood conditions. Con- 
ditions on f in the “cross-neighborhood” of the origin are essential. Some of 
the above tests were originally stated with only cross-neighborhood con- 
ditions, and we could state those which follow thus. We state the tests as we 
do for simplicity. 

From each of the above tests the corresponding test for simple series can
readily be deduced. We have, when f is a function of u alone, f = f(u), say,

$s_{m,n} = \frac{1}{\pi} \int_0^\pi f(u)\, \frac{\sin (m + \frac12) u}{\sin \frac12 u}\, du,$

which is the mth partial sum of the simple Fourier series of f corresponding
to the point u = 0. Now, if f satisfies ($\bar D$), f satisfies ($D_Y$); and if f satisfies
($\bar V$), f satisfies ($V_Y$). Hence from ($D_Y$) we can immediately deduce ($\bar D$), and
from ($V_Y$), ($\bar V$). The passage from ($J_H$) and ($J_T$) to ($\bar J$) is not so immediate,
but a simple application of the Riemann-Lebesgue theorem leads directly
to the desired conclusion.

1.5. An examination of §1.4 reveals that the types of tests for simple 
series which have not been considered for double series are Young’s, Hardy 


† Young, 15, p. 170. Young states his condition in another but equivalent form, namely:
($V_Y$*) F uv csc u csc v satisfies ($J_H$).

The right-hand member of (1.41) has, of course, no meaning when u = 0 or when v = 0. It is
implied that F can be so defined on the axes as to satisfy ($J_H$).

‡ For this analogue, see Young, 15, p. 138.


and Littlewood’s, and Lebesgue’s. Listed below is our extension of Tonelli’s 


criterion and the analogues we obtain of tests of these types. 


The conditions sufficient for the convergence of the series (1.21), to sum s, are, 
in 
(Jr) our extension of Tonelli’s test (Jr): (J7’), 
y) O(1), W(x, y) = O(1), 


and
($C_1$)  $\phi_1(x, y) = \int_0^x du \int_0^y \phi(u, v)\, dv = o(xy),$

where $\phi = f - s$;
(Y) our analogue of Young’s test (Y): 
(Y’) f is finitely defined everywhere in Q and 


z 
(1.51) f f | du.» { uof(u, v)} | Axy frO< 
0 0 


where A is independent of x and y, and 
(Co) o(x, y) = o(1); 


(Yr) our analogue of Pollard’s generalization (Yp) « of (Y): (V’) and (Ci); 
(HL) our analogue of Hardy and Littlewood’s test (HL): 


(AL’) f Az, uf(u, v) = O(xy), 


where 


$\Delta_{x,y} f(u, v) = f(u + x, v + y) - f(u + x, v) - f(u, v + y) + f(u, v),$
for some 


ff = f ‘de f = O(2y), 


wheret 


$\Delta_y f(u, v) = f(u, v + y) - f(u, v), \quad \Delta_x f(u, v) = f(u + x, v) - f(u, v),$
for some 1S p2 and some 1S and (C;); 


† There is some confusion in the notation $\Delta_x f$, $\Delta_y f$, and $\Delta_{x,y} f$, but this is not serious. Whenever
we have $\Delta_x h$, h has u as one of its arguments, and $\Delta_x h$ is the first difference
$\Delta_x h = h(u + x) - h(u).$
Similarly, y always appears with and is coupled with v. Finally, whenever we have $\Delta_{x,y} h$, h has u and
v as two of its arguments and $\Delta_{x,y} h$ is the first double difference
$\Delta_{x,y} h = h(u + x, v + y) - h(u + x, v) - h(u, v + y) + h(u, v).$
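The difference notation may be made concrete by a small sketch in which x is always coupled with the first argument u and y with the second argument v. The function names below are hypothetical, and the test function h is an arbitrary assumption chosen for illustration.

```python
import math

def delta_x(h, x):
    """First difference in the argument u: h(u + x, v) - h(u, v)."""
    return lambda u, v: h(u + x, v) - h(u, v)

def delta_y(h, y):
    """First difference in the argument v: h(u, v + y) - h(u, v)."""
    return lambda u, v: h(u, v + y) - h(u, v)

def delta_xy(h, x, y):
    """First double difference of h in u and v."""
    return lambda u, v: (h(u + x, v + y) - h(u + x, v)
                         - h(u, v + y) + h(u, v))

def sample_h(u, v):
    """Arbitrary smooth function used only to exercise the operators."""
    return math.sin(u) * math.exp(v)
```

The double difference is the result of applying the two single differences in succession, in either order.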




(L;) our analogue of Lebesgue’s test (Ls): 


(L!) f = o(1), 
a) f “dw f = f "de = 0(9), 


and 
= f f "| 9) | do = xy); 


($L_2$) our analogue of Lebesgue’s test ($\bar L_2$):


(i) (x, 9) = dv = o(1), 


and 


(Lz’) 


(Lp) our analogue of Pollard’s generalization (Lp) of (Lz): 


n= af a} * au = ay), 

and (C,); 


(Le) our analogue of Gergen’s generalization (Lr) of (Li): 


‘ ™zdyu dv _ 
kz u ky v 


and (C 1) . 




From each of these tests the corresponding test for simple series can im- 
mediately be deduced; but it will be noticed that the most general continuity 
condition we use is (C;) and not the analogue of (C), namely: 

(C) the series (1.21) is summable, to sum s, by some Cesàro means.

Thus we fail to extend completely to double series the tests (HL) and (Zz), 
and we fail to obtain any analogue of Hardy and Littlewood’s generalization 
(Yuz) of (Y) other than (Yp). The problem remains unsolved whether we can 
replace (Ci) by (C) in (Ze), (Yp), and (HL). This problem, if one follows the 
ideas in simple series, involves proving that the characteristict conditions of 
these tests imply the equivalence of (C;) and (C), and this in turn involves 
obtaining a generalization to double series of Hardy and Littlewood’sf 
theorem on summability and continuity. Of course the fact that each of the 
above tests leads directly to the corresponding test for simple series is due 
largely to the equivalence of (C;) to (C) whenever the latter is used. 

To establish the above tests we deduce first (Lz) and then show that (Le) 
contains all the other tests as particular cases. The facts in regard to (Lr) we 
state for convenience in the following theorem, the proof of which is to be 
found in §§2.0 to 3.1. The facts in regard to the relation of (Lz) to the rest of 
the tests are given in Theorems II to VII below. 


THEOREM I. If (Lp) holds, then the series (1.21) converges, to sum s. 


1.6. Turning now to the logical relations§ between the tests, we first 
state the following theorems, the proofs of which are to be found in §$§4.1 
to 13.1. 


THEOREM II. (a) The conditions (L,) and (Lz) are equivalent. (b) The con- 
dition (Lz) implies (Lp), while (Lp) implies (Lz) if (Ci*) holds. (c) The con- 
dition (Lp) implies (Lr), while (Lr) implies (Lp) if 


(1.61) oi*(x, y) = O(xy). 
Thus (Ly), (Lz), (Lp), and (Lr) are equivalent if (C:*) holds.|| 


+ The characteristic condition of a test consists of the conditions individual to the test. It is to 
be distinguished from the continuity condition, which is either (Co) or a generalization of (Co). 
In (Lp), for example, the characteristic condition is (Lg )-+(Lr’ ). 

¢ Hardy and Littlewood, 7. 

§ All our conclusions here are to the effect that certain conditions imply others. We make no 
attempt to prove that the implications stated are not reversible. For some examples of this type, see 
Hardy, 6. Hardy’s examples are for simple series, but conclusions for double series can easily be de- 
duced from them. 

In discussing these relations it is well to point out again that, because one test is included in 
another, the latter is not for that reason a better test. If one reasoned in that way the best test would 
be the one in which the condition for convergence is that the series converge. 

|| This theorem contains som< new information for simple series: that (Lp) implies (Z2) if (C*) 
holds, and that (Zz) implies (Zp) if the condition corresponding to (1.61) holds. 




THEOREM III. The condition (Jr) implies (Jz) and (L;), both with s =f(+0, 
+0). Moreover, (Jz) implies (Lp). 


THEOREM IV. The condition (Y) implies (Y p) and (Li), while (Y p) implies 
(Lp). 
THEOREM V. The condition (HL) implies (Lr). 


THEOREM VI. The condition (Jz) implies (Y) and (HL), both with s=f(+0, 
+0), and the latter with pi = p2=p3=1. 


THEOREM VII. The condition (Vy) implies (L;) with s=F(+0, +0). 
[Diagram of implications between the conditions]


Combining these results with the fact that both (Jy) and (Dy) imply 
(Vy),¢ and (Jz) implies (J7),} we obtain the accompanying diagram. In this 
diagram a directed line running from one letter to another indicates that the 
condition represented by the former implies that represented by the latter. 
In any implication in which s occurs only in the implied condition, s is under- 
stood to have the value indicated in the above theorem concerning this 
implication. It should be noted that, aside from the differences due to our 
use of (C;) rather than (C), there are only two essential differences between 
this diagram and the author’s§ diagram for simple series: first, we do not 
indicate here any implications between characteristic conditions; and 
secondly, we have here two new conditions, (Jr) and (Jz). In regard to the 
first difference it might be pointed out that our proofs show that all implica- 
tions indicated for simple series carry over to double series, and thus that, in 
particular, the characteristic condition of (Ze) is implied by every other 


† See the footnote on ($D_Y$) and Young, 15, p. 181.
‡ Tonelli, 13, p. 470.
§ Gergen, 4, p. 257.






characteristic condition. In regard to (Jr) and (Jz), these conditions, while 
analogous to (J) in some respects, do not seem to be contained in (Vy), (Y), 
and (HL). For this reason, and also because of their general character, these 
conditions seem to be essentially connected with a space of higher dimension- 
ality than one. 

2.0. Lemmas for Theorem I. The proof of Theorem I rests on the follow- 
ing lemmas. In these lemmas and throughout the rest of the paper we write 
for convenience 


$K(u, v; x, y) = K(u, v) = \sin \frac{\pi u}{x} \csc \tfrac12 u \, \sin \frac{\pi v}{y} \csc \tfrac12 v.$
We shall always suppose that x, y, k are numbers such that 
(2.01) 


We understand by A a number whose value is independent of all or any group 
of the variables u, v, x, y, k with which we are concerned at the moment, for 
those values of the variables in question lying in the proper range. The range 
for u and that for v is always specified. The range for x either is completely 
specified or else it is understood to be that part of the range indicated in (2.01) 
not excluded by any partial specification. A similar understanding holds with 
regard to y and k. 

We shall often have occasion to use the following formula for integrating
by parts:
a, b, 
d d 
(2.02) = — f bs)du 
by a, by 
pi(a2, v)¥ (a2, v)dv + du PWurdd, 

where 


¥(u, v) = 


This formula is valid if p is integrable on (a;, b:; de, be), y’ is absolutely con- 
tinuous on (a, a2), and ’’ is absolutely continuous on (dy, b2).T 


t This formula can readily be established by a double application of the formula for integrating 
by parts an integral involving but one variable and the application of other familiar results in the 
theory of Lebesgue integrals. The only question likely to occur is that of the measurability of the 
function Seu, é)dt and this question is answered in a theorem of Carathéodory, 2, p. 656. In any 
case the formula is a particular case of one given by Hobson, 8, p. 666. 




We have 
and our problem is to show that, if (Zz) holds, then 
S(x,y) = f oxo, 0; x, y) = o(1). 


As in the proof of (Lz) the problem is solved by breaking this integral into 
several parts and considering each part separately. In the lemmas we con- 
sider integrals over the region “near” the boundary of Q and also several 
functions which occur in the treatment of the integral over the area “away” 
from the boundary. 


2.1. Lemma 1. If ($C_1$) holds and if 0 < a, 0 < b, then

(2.11)  $I_1(x, y) = \int_0^{ax} du \int_0^{by} \phi K\, dv = o(1).$

Further,

$I_2(x, y) = \int_{\pi - ax}^{\pi} du \int_{\pi - by}^{\pi} \phi K\, dv = o(1)$

under the sole assumption that $\phi$ is integrable in Q.
We have

(2.12)  $uv|K| < A, \quad uvx|K_u| < A, \quad uvy|K_v| < A, \quad uvxy|K_{uv}| < A$

for 0 < u ≤ π, 0 < v ≤ π. Further, applying (2.02),

$I_1 = \phi_1(ax, by) K(ax, by) - \int_0^{ax} \phi_1(u, by) K_u(u, by)\, du - \int_0^{by} \phi_1(ax, v) K_v(ax, v)\, dv + \int_0^{ax} du \int_0^{by} \phi_1 K_{uv}\, dv.$

Thus

$I_1 = O\{\max_{0 < u \le ax,\ 0 < v \le by} |\phi_1(u, v)/(uv)|\} = o(1)$

by ($C_1$). This is (2.11).
As for $I_2$, we have immediately, since $\phi$ is integrable in Q,




2.2. Lemma 2. If (Ci) and (Lz’) hold, then


(2.21) odv = o(x), 


(2.22) | y)| < Axy. 


We observe first that, corresponding to an arbitrary number 0 < e, we can 
choose 0<ky and 0<6<7/2 so that 


(2.23) | oi(x, y)| < exy for x <6, y S65, 
a(x, y; ko) < ex, B(x, y; ko) <eyforx S56, y 
We observe next that, if x, c, and z are numbers such that 


(2.24) <6, (ko 


we consequently have 


z z (kotl)z 
| few f _¢dv| = f au) f -f + \ ode 
0 c 0 (ko+1)2 koz koz 


2.25) duff | 0)| do +| difx, (bo + | 


+ | oi(x, | 
< + ex(ko + 1)2 + exkoz < 2mex. 


We can now easily prove that (2.21) holds. In fact, if <6 and (ko +1)y 
<6, then (2.24) holds with c=x—y and z=y; and hence, applying (2.25), 


f a f odv| < 2mex. 
0 


Since ¢ was arbitrary, this proves that (2.21) holds. 

As for (2.22), let us first suppose that x <5<y. Then, choosing z so that 
N =(r—5)/z is an integer and so that 0<(k)+1)z<6, and denoting by a 
the largest of the numbers 6, 5+2, - - - , r—z less than y, we have 


N-1 z 6+(n+1)z z y 
| 9) | a) | + f a f + f odo 
0 8 0 a 


n=0 


< €x5 + 2nNex + 




upon applying (2.23) and (2.25). Thus, for <6<y, we have 
(2.26) | y) | < Axy. 


Similarly, (2.26) holds for ySé<zx. Accordingly, because of (2.23) and the 
obvious fact that (2.26) holds for <x, 5S, the proof is complete. 


2.3. Lemma 3. If (C;) and (Lr’’) hold and if 0<a,0<b, then 
(2.31) f au f dv = o(1). 
0 aw—by 


Writing 


») = f de f $(o, t)dt, 
0 a—by 


we have 


I; = )K(ax, x) — f K,(u, 
0 


$(ax, v)K,(ax, v)dv + f du f $Ku,dv 
0 ax—by 


aw—by 


= O(maximum | »)/2| ) 


Using Lemma 2 now we conclude the proof. 
2.4. Lemma 4. If (Ci) and (L#’) hold and if 0 <a, then 


az 
Ji(x, y; =f au f oQdv = 
0 ky 


_ 
Q = v; x, y) = Q(u, v) = sin — csc y), 


and 
w(v; y) = 2 csc + y) — csc + 2y) — csc 


Writing 
») = f do 40, 
0 ky 


we have, by Lemma 2, 


| ») | <Auv forO<uSr,ky Svar. 


where 




On the other hand, we have 
(2.41) v*| w(v; y)| < Ay?, w,| < Ay? for y< oS — y;f 
and thus 
uv? | v) | < uvtx| < Ay? 
uv?|Q,| < Ay, uv'x| < Ay 


(2.42) 


Thus 


Ji = o(ax, — 2y)Q(ax, — 2y) — f o(u, — 2y)Q.(u, — 2y)du 
0 


az w—2y 
(ax, v)Q,(ax, v)dv + f au 
0 k 


ky 
= O(y? + y? + 1/k + 1/k) = (1); 
which proves the lemma. ‘ 
2.5. Lemma 5. If (Ci) and (Lz’) hold and if 0 <a, then 


(2.51) = fa = o(1). 


We have, by Lemmas 1, 3, and 4, 


az w—2y ™ 
= f f +2 + \ + a(t) 
0 ky 


(k+l) y (k+2)y 


—JIitJi + = Ji + (1), 


sin — 


az 2Ayo _ 
f au f { sin — dv. 
o sin sin}(v+2y) sin}(v+ y) y 


1 az w—2y 2Ayo 
{ x sin}(v+2y) sin}(v+ y) in} 
= Ofa(ax, 2y; $k)/x + a(ax, y; k)/x} = o(1). 


Since J, is independent of k, the lemma follows. 


¢ Gergen, 4, p. 271. The first inequality in (2.41) is (8.72) and the second is (8.73) of that paper 
with M=A, m=2, v=1, p=0, t=». 


= 
where 
But 




2.6. Lemma 6. If (Le) and (Lx’) hold, then 


d 
(2.61). Js =f ao = 
kz 


and 0 and 0 <ky can be found so that 
(2.62) B(x, y; k) < Ay forx S %, ko Sk. 


We first observe that, corresponding to an arbitrary number 0<e, we 
can choose 0 <p and x» so that 


(2.63) 0 < xo(ko + 1) < 2/2, 

(2.64) B(x, y; ko) < ey, x(x, y; ko) < for x S xo, y S x. 
We next observe that, if 

(2.65) (ko 


we consequently have 


c koz u (ko+1)z2 koz koz koz u 

(2.66) wy(x, 2; bo) + B{ x, (ko + ko} 


< me + + < 2m. 


Consider, then, (2.61). If x <a» and (ko+1)yS4%o, then (2.65) holds with 
c=a—y and z=y. Accordingly, because of (2.66) and the fact that 


k) = J2(x, ko) for ko S k, 
we have 
Jo < for x S xo, (ko + 1)y S xX, ko Sk. 


Since ¢ was arbitrary, this proves that (2.61) holds. 
As for (2.62), we observe that, because of (2.64) and the fact that 


B(x, y; k) S B(x, y; ko) for ko S k, 
it is enough to prove that 
B(x, ko) < Ay for x S < y. 


But this is immediate; for, choosing z so that N=(x—2)/z is an integer 
and so that 0 <(ko+1)zS%o, we have 


N-1 
B(x, y; ko) S B(x, ko) S B(x, x0; ko) + 
n=0 Zotnz 


< + < Ay 


for x <2o<y by (2.64) and (2.66). This completes the proof. 
2.7. Lemma 7. If (Le) holds and if 0 <b, then 


f ao f du = o(1). 
w—by 0 


We have, using Lemmas 1 and 3, 


4I; f ao} +2 + f + 6(1) 
k ( 


—by z (k+1) 2 k+2)z 


Ji +J3' + o(1), 


4m 


= f av f { - \ sin — du, 
Sin 40 ae sin}(u+ 2x) sin x) x 


Jj'=- f av u; y, x)du. 
kz 


—by 
Now, as for Jj , we have immediately, upon applying Lemma 6, 
= Of J2(2x, by; $k) + Jo(x, by; k)} = a(1). 


Finally, as for Jj’, upon setting 


—by kez 


| v) | <Au for kx 
by Lemma 2. Thus, noting that (2.42) is applicable to the function 
Q(u, v) = QA(v, u; y, x), 


we have 


where 
we have 
i 
A 




Ji' = — 2x, — 2x, r) + o(u, 


kz 


— 2x, — 2x, v)dv — av f Qu, du 
ke 


w—by w—by 
= O(x? + 1/k + x? + 1/k) = O(1). 
This completes the proof. 
2.8. Lemma 8. If (C;) and (Lz’) hold, then 
TU rv 
J;= f w(u; x) sin “du w(v; y)o(u + x, 0 + y) sin— dv = d(1). 
kz x ky ¥ 
First, setting 
w(u, v) = w(u; x) sin (wu/x)w(v; y) sin (wv/y), 
and using (2.41), we have 
w | < Ax*y?, wy| < 
\ 
uty? | w,| < Axty, | < Axy 
Next, setting 
kz ky 
and using Lemma 2, we have 
Thus 


Js = — 2y)w(r — — 2y) — f — 2y) wu(u,e — 2y)du 


kz 


o(x — 2x, v) — 2x, v)dv + du dv 
kz 


ky ky 
= O(x?y? + y?/k + + 1/k*) = O(1). 

This proves the lemma. 
2.9. Lemma 9. If (Le) and (Le’) hold, then 


du 
J,= f | 9) | do f |A.o(u, v + y)|— = o(1). 
ky kz u 




We have 


— | A.o(u, + y)|— 
ky 


by (2.41). Thus, integrating by parts and using Lemma 6, 
° d 
J,=0O { + B(x, 2; = 6(1). 
ky v 


3.1. Proof of Theorem I. Because of the symmetry of the condition (Lz) 
with respect to the arguments of f, it is plain that Lemmas 3, 5, and 7 hold 
if we interchange the arguments of f in the integrals appearing there. Thus, if 
0 <a, then 


by 
f do f $(u, 0)K(u, 0; x, y)du = o(1), 
0 


by 7 
f au oKdu = o(1), f au dv = o(1). 
0 0 0 


Now it is readily seen by these relations and Lemmas 1, 3, 5, and 7 that 


ka (k+1)z (k+2) 2 


165(x, y= f Bdu + 6(1), 


where 


G= { f +2 + \ 
ky (k+l) y (k+2) y 


Hence, making in each of these integrals a change of variables which carries 
the region of integration into (kx, ky, r—2x, r—2y), and collecting the terms 
properly, we have 


16S = sin au f W(u, v; x, y) sin — dv + o(1), 
k 


kz x 
where 


{ Az.yo(u + x, + y) Az, yo(u,v + y) Az yo(ut x, v) 


sin + 2x) sin + 2y) sin Zusin}(v+2y) sin }(u-+ 2x)sin 40 


Azyolu, v) +2,9+y) A.o(u, v + y) \ 
sin + 2x) sin 
Ayo(u + x,0+ y) Ayo(u + x, v) 
sin + 2y) sin $v } 
+ w(u; x)w(v; y)o(u + x,0 + 
= +S2+53+ say. 


sin sin 




But 


f sin —du S; sin “ie = of f au f | Si| av} 
kz x ky kz ky 
= Ofr(x, y; #)} = a(1) 
and, by Lemma 9, 


Tv 
f sin — du f Se sin — dv 
kz x y 


Similarly, 
w—2y 
f sin “du f Ss da—@ = 6(1). 
k k 7 


z x v 


Finally, 


TU w—2y Tv 
f sin du S,sin — dv = 
kz x ky P 


by Lemma 8. 


Thus, 
s(x, y) = o(1) 


if (Lz) holds. This proves the theorem, since S is independent of k. 
4.1. Lemmas for Theorem II. Lemma 1. Jf (L?) and (L?’) hold, then 


(4.11) av f du = &(1), 


and 0 <7 and 0 <ky can be found so that 

(4.12) n(x, < Ay for x ko Sk. 
Further, if (L2) holds, then 

(4.13) Js(x, y; 1) = o(1). 


The proof concerning (4.11) and (4.12) is much the same as that of Lem- 
ma 6, §2.6. Given 0 <e¢ we can, because of (L?) and (L?’), choose 1 <p» and 
2X9 so that (2.63) holds, and 


(4.14) n(x, y; ko) < ey, ko) < € for x S y S %. 


Thus, if «S20, (ko+1)2S%9S¢<c+z2<7, we have 




(x, 2; ko) + x, (Ro + 1)z; ko} /(Roz) 
< we + + 1/ko) < 


Choosing suitable sets of values for c and z in this inequality, and making use 
of (4.14) and the fact that 


y; k) S Is(x, y; ho), n(x, y; k) S n(x, y; ko) for ko S k, 


we deduce easily (4.11) and (4.12). 
As for (4.13), we have 


Js(x, y; 1) -o| f+ au | 
= Off:(x, y) + m(x, 2y)/y} = 01) 


by (Zz). This completes the proof. 


4.2. Lemma 2. If (LP) and (L?’) hold, then (1.61) holds. Moreover, if 
(Le) holds, then (C;*) holds. 


Let A(x, y; &) be the upper bound of 
E(u, v; k)/u + Isa, y; 
for a fixed k for O0<u<x, 0<v<y, and let 
xy = x{k/(k +1)}*, = yf + 
for u=0,1, - - - . Then, for (k+1)4S7, (k+1) yS7,1kh, 


u=0 


kz 
(b+ Dy f du fol de, 
0 



and 




S (k + 1)*xd(ka, y; k) + (k + 1)%0(x). 
Accordingly, 


oi*(kx, ky) S (k + 1)4xy{2r(kx, y; k) + o(1)}. 
Now, if (LZ?) and (L?") hold, then, using Lemma 1, 
y; k) = O(1) 
for some fixed k; while if (Z.) holds, then 
A(x, 1) = off). 
The lemma follows. 


4.3. Lemma 3. If (L?) and (L?") hold, or if (Lx), (Lk’), and (1.61) hold, 
then 


(4.31) oi*(x, y) < Axy. 


The proof concerning (Z/ ) and (L?’’ ), as well as that concerning (ZL ) and 
(L%’), closely resembles the proof of (2.22). We need consider only (L/ ) 
and (L?’). We first observe that, by (ZL? ), (L?’), and Lemma 2, numbers 
0<e, 1<ko, and 0<5<7/2 can be found such that 


o1*(x, y) < exy, E(x, y; ko) < ex, n(x, y; ho) < ey 


for x <5, y<6. We next observe that, if x<6, (ko +1)255S5¢<c+zS7, we 
thus have 


z (kotl1)z 
0 c (ko+1)z koz koz 


< wt(x, y; bo) + x, (Ro + 1)z} /(oz) 
< + ex(1 + 1/hko) < 


Proceeding now as in the proof of (2.22), we deduce (4.31). 
4.4. Lemma 4. If (4.31) holds, then 


Jo = xy —f 
k 


ks u? y y2 


We have, upon integrating by parts, 




ar} 
k ke ky 


= O(xy + y/k + x/k + 1/k*) = O(1). 
4.5. Lemma 5. If (4.31) holds, then (L’) is equivalent to (L?’) 


We need consider only a and £. We have 
| do \ 


vf (v— y)o 


— 


= O(xy + x/k) = 0(x); 


and this proves the lemma. 
4.6. Lemma 6. If (Le) and (Le’) hold, then 


v du 
<f | = o(1). 


Moreover, if (L? ) and (L?") hold, then 


We may confine our attention to the second part of this lemma. We have 


Js = Of k) + n(x, 0; 


ky 


= O(y + 1/k) = (1) 


by (LZ? ) and (L?’) and Lemma 1. 
4.7. Lemma 7. If (C:*) holds, then (Lp) implies (L:), and (Lp) implies (Ls)


We may confine ourselves to (Lp) and (Zz). We have 


= x, (k + 1)y} /y+ y; = 




by (C,*) and (Lp). Thus, since £, is independent of , 
= o(x). 


Similarly, 
m = o(y). 


Accordingly, (Z/’) holds. 
As for (Li), we have 


= Off: {(k + 1)x, y}/x + m{x, Iy}/y + = 


by (C,*), (Lp), and (L?’). The lemma follows. 
5.1. Proof of Theorem II. We first note the identities 


uv u u u 
u? v v u v uv 


u(u+x)o(v+y) + x)(v + y) 


Az,y? 
(u+x)(v+y) 
We next note that it follows from these identities that 


(5.11) Jot for (k+1)x Sz, 8, 


Jy = dv, 
kz u? ky \ v 
and that 


(5.12) for (k+ sn, (k+1)y 


™du dv 
af —{ |Ayo|—. 
ky v 


Consider, then, the first part of (c). If (Zp) holds, then (4.31) holds by 
Lemma 3, and accordingly, by Lemma 4, 


Js = O(1). 






Further, 
Js = 6(1) 


by Lemma 6; and plainly, by the same reasoning as in Lemma 6, 


J» 6(1). 


Thus, by (5.11), (Zp) implies (Ze). But, by Lemmas 3 and 5, (Zp) also 
implies (Z’). The truth of the first part of (c) follows. 
Now consider the second part of (c). If (Lz) and (1.61) hold, then (4.31) 
holds by Lemma 3, and accordingly, by Lemma 4, 
Js = O(1). 
Further, 
J; = 6(1) 
by Lemma 6, and plainly, 
Jio = 6(1). 
Thus, by (5.12), (Ze) and (1.61) imply (Z?’). But, by Lemmas 3 and 5, 
(Le) and (1.61) also imply (Z?’ ). The second part of (c) follows. 
As for (b), the first part of (b) is trivial. The second part is proved in 
Lemma 7. 
Turning, finally, to (a), let us first suppose that (Z;) holds. Then, plainly 
(Le) holds, and thus, by (c), since (Z:) contains (C:*), (Lp) also holds. Ac- 
cordingly, (Zz) holds by Lemma 7. Thus (Z;) implies (Z2). 
Suppose, on the other hand, that (Zz) holds. Then (Zp) holds by (b), and 
accordingly, (Lz) holds by (c). But (Zz) also implies (Ci*) by Lemma 2. Thus, 
by Lemma 7 again, (Z;) holds. This completes the proof. 


6.1. Lemmas for Theorem III. Lemma 1. If (Jz) holds, then
(6.11) f(x, y) = O(1). 

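We choose $0 < \varepsilon$ and $0 < \delta < \pi/2$ so that
$$W_1(x, y) < \varepsilon, \quad W_2(x, y) < \varepsilon \quad \text{for } x \le 2\delta,\ y \le 2\delta. \tag{6.12}$$
Then we have, since $f(\delta, \delta)$ is finite,

$$|f(x, y)| \le |f(x, y) - f(\delta, y)| + |f(\delta, y) - f(\delta, \delta)| + |f(\delta, \delta)| \le W_1(\delta, y) + W_2(\delta, \delta) + |f(\delta, \delta)| < A$$

for $x \le \delta$, $y \le \delta$. This is (6.11).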




6.2. Lemma 2. If $0 < a < b$ and if, $u$ being fixed, $f(u, v)$ is finite and integrable in $v$ over $(a, b + y)$, then


b d b+y 


In fact, writing 


vo) = f 
y is measurable and we have, if ¥(b+-y) is finite, 


But plainly (6.21) holds if ¥(b+-y) is infinite; and this proves the lemma. 
6.3. Lemma 3. If 0<i<- and if (J1’) holds, then 


™z dy ftv dv 
Ju= f | Az.yf|— = o(1). 
v 


We have, upon applying Lemma 2, 


{ fiw fas 
-0{— = 6(1). 


6.4. Lemma 4. If (J7') and (J2’) hold, then 


a(x, y;k) = d(x). 


We observe, first, that $f(+0, v)$ exists for nearly all values of $v$ on $(0, \pi)$, for $f$ is of bounded variation as a function of $u$ for almost all values of $v$ on this interval.

We observe, secondly, that $f(+0, v)$ is integrable on $(0, \pi)$. In fact, if $u_0$ is any number such that $0 < u_0 < \pi$, we have


$$|f(u, v)| \le |f(u_0, v)| + V(v) \quad \text{for } (u, v) \text{ in } Q.$$


Plainly, then, $f(+0, v)$ is the limit function of a sequence of integrable functions $\{f(u_n, v)\}$, $n = 1, 2, \ldots$, satisfying a condition of the type


| f(un, »)| S$ Vo(v) forO 




where $V_0$ is integrable on $(0, \pi)$. The integrability of $f(+0, v)$ now follows from a familiar theorem of Lebesgue.*
We observe, thirdly, that 


To prove this we note that 

B(z, ») = — 14 0,9] dus 
Thus $B$ tends to zero with $x$ for nearly every $v$ on $(0, \pi)$, and


B(x, v) S V(v) for OS 


Accordingly, since $B$ is integrable in $v$ for every fixed positive value of $x$, (6.41) follows from the theorem of Lebesgue mentioned above.†

We observe, finally, that, upon choosing $0 < \varepsilon$ and $0 < \delta < \pi/2$ so that (6.12) holds, we have


$$\int_0^x du \int_{ky}^{\delta} |\Delta_y f| \, \frac{dv}{v} \le x\varepsilon/k \quad \text{for } x \le 2\delta,\ ky \le \delta. \tag{6.42}$$


This results immediately upon applying Lemma 2. 
The lemma now follows readily. We have 


+ ef | Ayf(+ 0, ») | av} = 6(x) 
0 
as a consequence of (6.41), (6.42), and the well known fact that 
f | A,f(+ 0, ») | do = o(1). 
0 


* Lebesgue, 11, p. 375. The full theorem referred to is to the effect that, if $f_n(P)$, $n = 1, 2, \ldots$, is integrable on the bounded measurable set $E$, if
$$|f_n(P)| < \phi(P) \quad \text{for } n = 1, 2, \ldots,\ P \text{ on } E,$$
where $\phi$ is integrable on $E$, and if
$$\lim_{n\to\infty} f_n(P)$$
exists nearly everywhere in $E$, then the limit function $f(P)$ of the sequence $\{f_n(P)\}$ is integrable on $E$, and
$$\lim_{n\to\infty} \int_E f_n(P)\,dP = \int_E f(P)\,dP.$$

† It is clear that the conclusion of this theorem remains the same if we replace the discrete variable $n$ by a continuous one.




7.1. Proof of Theorem III. We first prove that (J7) implies (ZL). For 
this we choose $0 < \varepsilon$ and $0 < \delta < \pi/2$ so that (6.12) holds; and write


8 ky kz 8 kz Vky v 
+ Ju, say. 


Now, 
Ju = 


by (J7’) and Lemma 3, and plainly, by the same reasoning, 
Jie = O(1). 


Further, 
Jis = 


as is well known. It remains, then, to consider J14. 
We write 


du dv "8 dy du 
Ju= f + | 
kz uy/z vzly u 


= Jil + Jia” » Say, 


where 


if xs y, if xs y, 


1 if y<x, y/xif y< x. 


Then we have 


= 0 ls { | Auf(m, ») | +] Ayf(u + x, 0) | }— “| 


- 04s = 


upon making use of (6.12) and Lemma 2. In the same way, of course, we get 
Jia = O(1) 
and it follows that (Jz) implies (Lz ). 

The theorem is now immediate. First, since (Jr) implies (Co) with 
s=f(+0, +0), (Jr) implies (Jz) with s=f(+0, +0). Next, because (C;) is 
common to (Jz) and (Za), it is plain, by Lemma 4 and what we have just 
proved, that (Jz) implies (Lz). Thus, since (Jz) implies (1.61) as a conse- 
quence of Lemma 1, it follows from Theorem II, part (c), that (Jz) implies 




(Lp). Finally, making use of the fact that (Jr) implies (Co) with s=f(+0, +0) 
and Theorem II, parts (a) and (b), we find that (Jr) implies (Zi) with s= 
f(+0, +0). This completes the proof. 
8.1. Lemmas for Theorem IV. Lemma 1. If F satisfies (Ju), then the limits
$$\lim_{(u,v)\to(+0,+0)} F(u, v), \qquad \lim_{u\to+0}\,\lim_{v\to+0} F(u, v), \qquad \lim_{v\to+0}\,\lim_{u\to+0} F(u, v)$$


exist and are equal, and further, if we set 


+ f 


(8.11) 


(8.12) P—N=F 

then both P and N satisfy the following conditions: 

(8.13) OS P<A 

(8.14) 

(8.15) OSAP 

(8.16) O<Az,P 
(8.17) P is integrable in Q. 

The proofs of these facts, with the exception of the last, are given by 
Hardy.* The proof that (8.17) holds can be made to rest on a theorem of
Young.† Young proves that, if the conditions (8.13) to (8.16) hold, then P
is continuous at every point in the interior of Q, with the possible exception 
of those points found on a denumerable set of lines, each of which is parallel 
either to the u- or the v-axis. Thus, assuming Hardy’s results, it follows that, 
if a is any constant, the set of points on which P <a consists of an open set 
plus, possibly, a set of zero measure. Accordingly, P is measurable in Q; and 
thus, using (8.13), it follows that (8.17) holds. 

8.2. Lemma 2. If (8.14), (8.15), (8.16), and (8.17) hold, and if


(8.21) 0s P<Aw 
then P/(uv) satisfies (L? ) and (L?’). 


* Hardy, 5, pp. 57-59. Hardy defines P and N in terms of the positive and negative variation of 
F, but his definitions are equivalent to ours. Hardy states (8.17) without proof. 
† Young, 16, p. 31.




We have 
™y P 
= d 7 
d 
= ™du ™du du 
= Of ay f =f — — dy 
eget ky ke ong ens ky? 


(k+1)2 dy (k+l) y P 
u? ™—y v3 kz u? ky v? 


= Of1/k? + y/k + + xy + 1/k*} = O(1), 


= O(x/k + xy) = G(x). 


Treating 7 in a similar manner, we conclude the truth of the lemma. 

9.1. Proof of Theorem IV. It is plain that (Y) implies (Yp) and (C,*). 
As a consequence of Theorem II, then, it is enough to prove that (Y p) 
implies (Zp). For this we note that, if (Yp) holds, then the function 


F = 


satisfies (Jz), for, under these circumstances, F is finitely defined everywhere 
in Q and 


f = 0, lero, 


We note, further, that, on defining P and N as in Lemma 1, we have 


| du.o(uof)| + uv| s| < 
0 


Applying Lemmas 1 and 2 now, we see that ¢ satisfies (ZL?) and (L?’). 
Since (C;) is common to (Zp) and (Yp), the proof is complete. 
10.1. Proof of Theorem V. We have, if 1<;, 1< pz, 




= l/p, du © dy lla 
0 0 ke y 
(—+ 
1 


Of kx) ky) - = 6(1), 


Of Ry) = O(xk-/™) = (x), 


O | iv} = O(1/k*) = o(1), 


a=0O | A,f| ao} = O(x/k) = 6(x) 


if p= f2=1. Treating B in a similar manner and noting that (C:) is common 
to (Lz) and (HL), we deduce the truth of the theorem. 

11.1. Proof of Theorem VI. Since (Jz) implies (Co) with s=f(+0, +0) 
and f is finitely defined everywhere in Q, we have only to prove that f satisfies 
(1.51), (HL’), and (HL"’), the last two with = p2=p3=1. 

Consider (1.51). We have 
| Az.v(uof)| (uw + + y)| + 

+ x(0 + y)| f| 


0b | + oy} ff "| deste, } 


+ xy maximum | | 
OSeSa,0StSb 


for Osu Thus 


+ f \as0, maximum | f| <Axy. 


OSuSx 


and 


1 
q1 
1 1 
pe q2 
|| 




This is (1.51). 
Turning now to (HL’) and (HL’’), we define P and N as in Lemma 1, 
§8.1. Then, using the results of that lemma, we have 


™y 
f auf |AzyP| dy = f au A.,»Pdot 
0 0 0 0 
v 
= of f au Pav + f au Pas\ 
0 0 


= O(xy), 


| AyP| dv - of = O(xy). 


Treating N and the other integral in (HL’’) in the same manner, we infer the 
truth of the theorem. 

12.1. Lemmas for Theorem VII. Lemma 1. If F satisfies (Ju) and is absolutely continuous* in $(a, a; \pi, \pi)$ for every $0 < a < \pi$, then F coincides in the region $0 < u \le \pi$, $0 < v \le \pi$ with a function G which is absolutely continuous in Q.

We first define $P$ and $N$ as in Lemma 1, §8.1, for $0 < u \le \pi$, $0 < v \le \pi$, and note that $P(u, +0)$, $P(+0, v)$, and $P(+0, +0)$ exist, the first for every $u$ and the second for every $v$, on $(0, \pi)$. We complete the definition of $P$ by setting


$$P(u, 0) = P(u, +0) \ \text{ for } 0 < u \le \pi, \qquad P(0, v) = P(+0, v) \ \text{ for } 0 < v \le \pi,$$


$$P(0, 0) = P(+0, +0).$$
We next note that 


"| duP(u, x) | = o(1), f f | = o(1), 


0 


(12.11) 


In fact, upon making use of (8.14) and (8.15), our definition of P on the axes, 
and the appropriate limiting processes, we find that 


* By definition $F$ is absolutely continuous in $(a, a; \pi, \pi)$ if (i) the functions F(u, 0) and F(0, v) are absolutely continuous on $(a, \pi)$, and if (ii), corresponding to each $0 < \varepsilon$, we can so choose $0 < \delta$ that, if $\{(x_i', y_i'; x_i'', y_i'')\}$, $i = 1, 2, \ldots$, is any collection of rectangles contained in $(a, a; \pi, \pi)$, no two of which have a common interior point, and the total measure of which is less than $\delta$, then

This definition is equivalent to Carathéodory’s, 2, p. 653, but is different from Hobson’s, 8, p. 346, 
which requires only that (b) be satisfied. 




0< A,P(u,7) 
0 yr. 


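Thus we have


$$\int_0^x |d_u P(u, \pi)| = P(x, \pi) - P(0, \pi) = o(1),$$


$$\int_0^x\!\!\int_0^{\pi} |d_{u,v} P(u, v)| = P(x, \pi) - P(0, \pi) - P(x, 0) + P(0, 0) = o(1) - P(x, 0) + P(0, 0) = o(1),$$

since

$$P(0, 0) = P(+0, +0) = \lim_{x\to+0}\,\lim_{y\to+0} P(x, y) = \lim_{x\to+0} P(x, 0).$$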


This proves that the first two relations in (12.11) hold. The last two can, of 
course, be proved correct in a similar manner. 

To complete the proof of the lemma, we now define N on the axes, as we 
did P, in terms of its limiting values, and we set 


$$G = P - N \quad \text{for } (u, v) \text{ in } Q.$$
Then plainly $N$ satisfies (12.11), and therefore so also does $G$. But
$$G = F \quad \text{for } 0 < u \le \pi,\ 0 < v \le \pi, \tag{12.12}$$


and therefore $G$ is absolutely continuous in $(a, a; \pi, \pi)$ for every $0 < a < \pi$. We
conclude from these two properties of $G$ that $G$ is absolutely continuous in $Q$.
By (12.12), this proves the lemma. 


12.2. Lemma 2. If g is integrable over Q, then uvg satisfies (L2). 


We have 


™y 
i= f au | Az.yg| dv = o(1), 
z 
z ™y 
[udu f do f f | Aug | dot = o(x). 
0 v 0 0 


Similarly, 
m = o(y). 


12.3. Lemma 3. If $h(u)$ is integrable over $(0, \pi)$, and if $g(u, v)$ is integrable over $Q$, then the function $uH(u, v)$, where




H(u, ») = — f 


satisfies (Lz). 
We have 


- of, | A. f ae | Asg(u, t) | dt 

- of | i | ash = o(1). 


Further, 


= g(u, t) | as\ = o(x). 


Finally, 


n= of f ao f | A.h| du + f ao au f | A.g(u, 2) | 
0 z 0 z v 
= of f | A.h| du + yf au f | A.g(t, 2) | = o(y). 
0 z 0 


This completes the proof. 

13.1. Proof of Theorem VII. By our hypothesis, $F$ can be so defined on the axes as to satisfy (J). Moreover, $F$ obviously is absolutely continuous in $(a, a; \pi, \pi)$ for every $0 < a < \pi$. Thus, by Lemma 1, $F$ coincides in the region $0 < u \le \pi$, $0 < v \le \pi$ with a function $G$ absolutely continuous in $Q$.

Now, since $G$ is absolutely continuous in $Q$, there exist functions $g(u, v)$, $h(u)$, $l(v)$, the first integrable over $Q$ and the last two over $(0, \pi)$, such that, for $(u, v)$ in $Q$, $G$ is given by*


* See Hobson, 8, pp. 592, 615, or Carathéodory, 2, p. 654. 




f f f — f + G(x, x) 


hy om + G(r, ™), say. 


We proceed to express $f$ in terms of $g$, $h$, and $l$.
We have, for $0 < u < u + x \le \pi$, $0 < v < v + y \le \pi$,


fof = A, 


$$= (u + x)(v + y)\Delta_{x,y}G + (u + x)\,y\,\Delta_x G + x(v + y)\Delta_y G + xyF.$$
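The algebra behind an expansion of this shape can be checked numerically. The following is a minimal sketch, assuming that $\Delta_x$, $\Delta_y$, $\Delta_{x,y}$ denote the usual forward differences and that the quantity being expanded is $\Delta_{x,y}(uvG)$; the test function `G` and the sample points are hypothetical, chosen only for illustration.

```python
from fractions import Fraction as Fr

def G(u, v):
    # Hypothetical smooth test function (an assumption, not from the paper).
    return u * u * v + 3 * u * v**3 - v + 2

def uvG(u, v):
    return u * v * G(u, v)

def d_x(h, u, v, x):
    # Forward difference in the first variable.
    return h(u + x, v) - h(u, v)

def d_y(h, u, v, y):
    # Forward difference in the second variable.
    return h(u, v + y) - h(u, v)

def d_xy(h, u, v, x, y):
    # Double (rectangle) difference.
    return h(u + x, v + y) - h(u + x, v) - h(u, v + y) + h(u, v)

def rhs(u, v, x, y):
    # (u+x)(v+y) D_xy G + (u+x) y D_x G + x (v+y) D_y G + x y G
    return ((u + x) * (v + y) * d_xy(G, u, v, x, y)
            + (u + x) * y * d_x(G, u, v, x)
            + x * (v + y) * d_y(G, u, v, y)
            + x * y * G(u, v))

# Exact rational sample points (u, v, x, y).
points = [(Fr(1, 2), Fr(1, 3), Fr(1, 5), Fr(2, 7)),
          (Fr(3), Fr(-2), Fr(1, 4), Fr(5, 3)),
          (Fr(0), Fr(1), Fr(1), Fr(1))]
identity_holds = all(d_xy(uvG, *p) == rhs(*p) for p in points)
```

Because exact rational arithmetic is used, `identity_holds` reflects an exact comparison of the two members rather than a floating-point approximation.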


Hence, upon dividing each member of this equation by $xy$, setting $y = x$, and letting $x$ tend to zero, we have*
$$f(u, v) = uvg + uH + vL + F, \tag{13.11}$$


where 


$$H(u, v) = h(u) - \int_v^{\pi} g(u, t)\,dt,$$


$$L(u, v) = l(v) - \int_u^{\pi} g(t, v)\,dt,$$


for almost all $(u, v)$ in $Q$.

The theorem now follows. Since $g$ and $h$ are integrable, $uvg$ and $uH$ satisfy (L2) by Lemmas 2 and 3. Similarly, since $l$ is integrable, $vL$ satisfies (L2). Applying Theorem II, then, we see that (L1) holds with $f$ replaced by $uvg + uH + vL$ and $s$, by zero. On the other hand, since $F$ satisfies (Jz), it satisfies (Y) with $s = F(+0, +0)$ by Theorem VI. Hence, by Theorem IV, $F$ satisfies (L1) with $s = F(+0, +0)$. Combining these results with (13.11), we reach the desired conclusion.


REFERENCES 


1. Adams, C. R., to be published. 

2. Carathéodory, C., Vorlesungen über Reelle Funktionen, Leipzig and Berlin, 1927.

3. Merriman, G. M., On certain theorems regarding summable series and their application to 
double and triple Fourier series, American Journal of Mathematics, vol. 47 (1925), pp. 125-139. 

4. Gergen, J. J., Convergence and summability criteria for Fourier series, Quarterly Journal of 
Mathematics, Oxford Series, vol. 1 (1930), pp. 252-275. 

5. Hardy, G. H., On double Fourier series, Quarterly Journal of Pure and Applied Mathematics, 
vol. 37 (1905), pp. 53-79. 

6. On certain convergence criteria for Fourier series, Messenger of Mathematics, vol. 49 
(1919-20), pp. 149-155. 


* The proof concerning the double differences can be found in Hobson, 8, p. 614, or in Carathé- 
odory, 2, p. 496; that concerning the simple differences, in Hobson, 8, pp. 588, 611, or in Carathéodory, 
2, p. 658. 




7. Hardy, G. H. and Littlewood, J. E., Solution of the Cesàro summability problem for power series and Fourier series, Mathematische Zeitschrift, vol. 19 (1923), pp. 67-96.

8. Hobson, E. W., Theory of Functions of a Real Variable, Cambridge, vol. 1, 1927.

9. Theory of Functions of a Real Variable, Cambridge, vol. 2, 1926.

10. Küstermann, W. W., Über Fouriersche Doppelreihen, Inaugural Dissertation, München, 1913.

11. Lebesgue, H., Sur l’intégration des fonctions discontinues, Annales Scientifiques de l’École Normale Supérieure, (3), vol. 27 (1910), pp. 361-450.

12. Pringsheim, A., Elementare Theorie der unendlichen Doppelreihen, Sitzungsberichte der Mathematisch-Physikalischen Classe der Akademie der Wissenschaften zu München, vol. 27 (1897), pp. 101-153.

13. Tonelli, L., Serie Trigonometriche, Bologna, 1928. 

14. Sulla Convergenza delle Serie Doppie di Fourier, Annali di Matematica, (4), vol. 4 
(1927), pp. 29-72. 

15. Young, W. H., Multiple Fourier series, Proceedings of the London Mathematical Society, 
vol. 11 (1913), pp. 133-184. 

16. On multiple integrals, Proceedings of the Royal Society of London, vol. 93 (1916-17), 
pp. 28-42. 


HARVARD UNIVERSITY, 
CAMBRIDGE Mass. 


ON THE NUMERATORS OF THE CONVERGENTS OF THE 
STIELTJES CONTINUED FRACTIONS* 


BY 
JACOB SHERMAN 


Introduction. The object of this paper is the study of the numerators of the infinite continued fractions introduced by Stieltjes [1, 2].† They are of two types:

(a) the “associated” continued fraction:


$$K(z) = \cfrac{\lambda_1}{z - c_1 - \cfrac{\lambda_2}{z - c_2 - \cfrac{\lambda_3}{z - c_3 - \cdots}}} \tag{1}$$


the $n$th convergent of which will be denoted by $\Omega_n(z)/\phi_n(z) = K_n(z)$ ($n = 0, 1, 2, 3, \ldots$);
(b) the “corresponding” continued fraction:


$$W(z) = \cfrac{b_1}{z - \cfrac{b_2}{1 - \cfrac{b_3}{z - \cfrac{b_4}{1 - \cdots}}}} \tag{2}$$


the $n$th convergent of which will be denoted by $U_n(z)/V_n(z) = W_n(z)$ ($n = 0, 1, 2, \ldots$). In (a) $\lambda_i$, $c_i$ are real constants with $\lambda_i > 0$ for $i = 1, 2, 3, \ldots$; $\Omega_n(z)$, $\phi_n(z)$ are polynomials of degree $n - 1$ and $n$ respectively.‡ In (b) the $b_i$ are real constants, $b_1 > 0$, $b_{2i+1}b_{2i} > 0$ for $i = 1, 2, \ldots$. $U_{2n+\epsilon}(z)$, $V_{2n+\epsilon}(z)$ are polynomials of degree $(n + \epsilon - 1)$ and $(n + \epsilon)$ respectively, where $\epsilon = 0, 1$. For our study we shall need certain results from the theory of continued fractions [2] and of the so-called moments problem [3].

(i) The convergence of the associated and corresponding continued fractions. 
Here of fundamental importance is Grommer’s Selection Theorem [2]: 

From every sequence of convergents of a continued fraction of type (1) or (2) 
there may be selected a sub-sequence, which for all non-real z converges to a 
Stieltjes integral of the form $\int_{-\infty}^{\infty} (1/(z-u))\,d\psi(u)$.

Here $\psi(z)$ (and hereafter $\psi_1(z)$, $\psi_2(z)$, ...) denotes generally a bounded monotonic non-decreasing function with infinitely many points of increase in $(-\infty, \infty)$, such that all integrals $\int_{-\infty}^{\infty} z^n\,d\psi(z) = a_n$ (“moments”) exist (for

* Presented to the Society, March 26, 1932; received by the editors April 28, 1932. 


The author wishes to express his gratitude to Professor J. Shohat for many valuable suggestions 
concerning the subject of this paper. 

† The numbers in brackets refer to the list of literature at the end of the paper.

t We note that ©,(z) - - does not depend on Az and ¢n(z) =2"+ - - does not depend 
on ;. This will be useful in our later discussion. 




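$n = 0, 1, 2, 3, \ldots$), with $a_0 > 0$. We may without loss of generality assume $\psi(-\infty) = 0$. The continued fraction (1) or (2) is said to be “associated” with or “corresponding” to, respectively, the integral $\int_{-\infty}^{\infty} (1/(z-u))\,d\psi(u)$ or its formal development

$$\sum_{n=0}^{\infty} \frac{a_n}{z^{n+1}}.$$


In symbols:


$$F(z) = \int_{-\infty}^{\infty} \frac{d\psi(u)}{z - u} \sim \sum_{n=0}^{\infty} \frac{a_n}{z^{n+1}} = P\!\left(\frac{1}{z}\right) \qquad \left(\int_{-\infty}^{\infty} u^n\,d\psi(u) = a_n\right). \tag{3}$$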


The association (3) means formally [3] 
dy(u) 
(4) 


similarly the correspondence means formally 

dy(u) 

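It is known that a continued fraction of form (1) may be obtained from one of form (2) by “contraction,” and then
$$U_{2n}(z) = \Omega_n(z), \qquad V_{2n}(z) = \phi_n(z) \qquad (n = 0, 1, 2, 3, \ldots).$$

The “association” (4) shows [2] that the $\{\phi_n(z)\}$* constitute an orthogonal set of polynomials with regard to the distribution $d\psi(z)$, i.e.
$$\int_{-\infty}^{\infty} \phi_m(x)\phi_n(x)\,d\psi(x) \ \begin{cases} = 0, & m \ne n \\ > 0, & m = n \end{cases} \qquad (m, n = 0, 1, 2, 3, \ldots), \tag{5}$$


where $(-\infty, \infty)$ may be “reducible” (see page 70) to a sub-interval $(a, b)$. We might have $\psi(x) = \int_{-\infty}^{x} p(x)\,dx$ ($p(x) \ge 0$ in $(-\infty, \infty)$); then in all our formulas $d\psi(x)$ is to be replaced by $p(x)\,dx$. In particular (5) becomes


$$\int_{-\infty}^{\infty} \phi_m(x)\phi_n(x)\,p(x)\,dx = 0, \quad m \ne n \qquad (m, n = 0, 1, 2, 3, \ldots).$$


Here we say that $\{\phi_n(x)\} = \{\phi_n(x; a, b; p)\}$ forms an orthogonal set corresponding to the characteristic function $p(x)$ ($p(x) \ge 0$ in $(-\infty, \infty)$). In some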

* Throughout this paper $z$ represents the complex variable $x + iy$.




instances we might have $\psi(x) = \psi(a)$ for $x < a$ and $\psi(x) = \psi(b)$ for $x > b$ ($p(x) = 0$ outside $(a, b)$) in which case $\int_{-\infty}^{\infty} (1/(z-u))\,d\psi(u)$ reduces to $\int_a^b (1/(z-u))\,d\psi(u)$ and (5) becomes


$$\int_a^b \phi_m(x)\phi_n(x)\,d\psi(x) = 0, \quad m \ne n \qquad (m, n = 0, 1, 2, 3, \ldots).$$


The function $F(z) = \int_{-\infty}^{\infty} (1/(z-u))\,d\psi(u)$ in (3) is known [2] to be regular and analytic in any finite closed region of the complex $z$-plane which does not contain any portion of the real axis. Such a region we call, for brevity, an “$\Omega$-region” [3]. In such a region $1/F(z)$ is also analytic. In fact it is readily proved that $F(z)$ has no zeros in an $\Omega$-region. We now introduce, following Hamburger [3], the continued fraction


$$K_n(z, t) = \cfrac{\lambda_1}{z - c_1 - \cfrac{\lambda_2}{z - c_2 - \cdots - \cfrac{\lambda_n}{z - c_n + t}}} \tag{6}$$


$\lambda_i$, $c_i$ as in (1), $t$ real, arbitrary; $K_n(z, \infty) = K_{n-1}(z)$. Evidently $K_n(z, 0) = K_n(z)$.
We take, from Hamburger [3], the following


DEFINITION. The continued fraction $K(z)$ converges “completely” to the function $F(z)$ of the type (3) if, for arbitrarily small $\varepsilon > 0$ and for every $\Omega$-region, $|K_n(z, t) - F(z)| < \varepsilon$ ($z$ in $\Omega$, $n \ge N$) for every real arbitrary $t$, including $t = \infty$, where $N$ depends on $\varepsilon$ and $\Omega$ only.


(ii) The $[a_n]_0^{\infty}$-moments problem. By this is meant the determination of $\psi(z)$ of the above nature, given the real set $[a_n]$ ($n = 0, 1, 2, \ldots$) of its moments. If the set determines $\psi(z)$ uniquely (disregarding additive constants) the moments problem is said to be “determined”; otherwise it is said to be “indetermined.” In case of an “association,” as given in (4), we say that $\psi(z)$ is the solution of the moments problem related to $P(1/z)$ or $K(z)$, or, see (5), to the orthogonal set $\{\phi_n(x)\}$. Hamburger [3] has shown that the complete convergence of the associated continued fraction $K(z)$ to the integral $\int_{-\infty}^{\infty} (1/(z-u))\,d\psi(u)$ is a necessary and sufficient condition for the moments problem related to $K(z)$ to be determined over the interval $(-\infty, \infty)$.

Our purpose is the study of the polynomials $\{U_{2n}(x)\}$, $\{U_{2n+1}(x)\}$, $\{V_{2n+1}(x)\}$ and their relations to the orthogonal polynomials $\{\phi_n(x)\}$. Some of the results concerning $\{U_{2n}(x)\}$ have been presented by J. Shohat and J. Sherman in [4].




1. The orthogonality properties of the numerators $\Omega_n(x)$. The study of the $\Omega_n(x)$ is based upon the continued fraction
$$K'(z) = \cfrac{\lambda_2}{z - c_2 - \cfrac{\lambda_3}{z - c_3 - \cfrac{\lambda_4}{z - c_4 - \cdots}}} \tag{7}$$


($\lambda_i$, $c_i$ the same as in (1)), whose successive convergents will be denoted by $P_n(z)/Q_n(z) = K'_n(z)$ ($n = 0, 1, 2, 3, \ldots$).
LEMMA 1. $\Omega_{n+1}(z) = \lambda_1 Q_n(z)$ ($n \ge 0$).
We have for $K(z)$, $K'(z)$ respectively
$$\Omega_{n+1}(z) = (z - c_{n+1})\,\Omega_n(z) - \lambda_{n+1}\,\Omega_{n-1}(z) \qquad (n = 1, 2, 3, \ldots) \tag{8}$$
$$(\Omega_0(z) = 0,\ \Omega_1(z) = \lambda_1,\ \Omega_2(z) = \lambda_1(z - c_2));$$


$$Q_n(z) = (z - c_{n+1})\,Q_{n-1}(z) - \lambda_{n+1}\,Q_{n-2}(z) \qquad (n = 2, 3, 4, \ldots) \tag{9}$$
$$(Q_0(z) = 1,\ Q_1(z) = z - c_2).$$


The truth of the lemma is seen by comparing (8) and (9). In fact, they represent the same difference equation with the initial conditions $Q_0(z)$, $Q_1(z)$ and $\Omega_1(z)$, $\Omega_2(z)$, respectively, differing only by a constant factor ($\lambda_1$). Our lemma leads to the following important conclusion:


$K'(z)$ is an infinite continued fraction, the denominators of whose convergents are, to within a constant factor independent of $n$, identical with the numerators of the convergents of $K(z)$.
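Lemma 1 can be illustrated by running the two difference equations side by side. The sketch below assumes the recurrences (8) and (9) in the form $\Omega_{n+1} = (z - c_{n+1})\Omega_n - \lambda_{n+1}\Omega_{n-1}$ and $Q_n = (z - c_{n+1})Q_{n-1} - \lambda_{n+1}Q_{n-2}$; the helper names and the sample values of $\lambda_i$, $c_i$ are hypothetical. Polynomials are stored as integer coefficient lists, lowest degree first.

```python
def mul_linear(p, c):
    """Multiply the polynomial p by (z - c)."""
    out = [0] * (len(p) + 1)
    for i, a in enumerate(p):
        out[i + 1] += a
        out[i] -= c * a
    return out

def sub_scaled(p, q, s):
    """Return p - s*q, padding the shorter list with zeros."""
    n = max(len(p), len(q))
    p = p + [0] * (n - len(p))
    q = q + [0] * (n - len(q))
    return [a - s * b for a, b in zip(p, q)]

def numerators(lam, c, n_max):
    """Omega_0 .. Omega_{n_max} of K(z), from recurrence (8)."""
    omega = [[0], [lam[1]]]
    for n in range(1, n_max):
        omega.append(sub_scaled(mul_linear(omega[n], c[n + 1]),
                                omega[n - 1], lam[n + 1]))
    return omega

def shifted_denominators(lam, c, n_max):
    """Q_0 .. Q_{n_max} of K'(z), from recurrence (9)."""
    q = [[1], [-c[2], 1]]
    for n in range(2, n_max + 1):
        q.append(sub_scaled(mul_linear(q[n - 1], c[n + 1]),
                            q[n - 2], lam[n + 1]))
    return q

# Hypothetical sample coefficients (lambda_i > 0, c_i real), indexed from 1.
lam = {1: 3, 2: 2, 3: 5, 4: 1, 5: 7}
c = {1: 1, 2: -2, 3: 0, 4: 4, 5: -1}

omega = numerators(lam, c, 5)
q = shifted_denominators(lam, c, 4)
lemma_holds = all(omega[n + 1] == [lam[1] * a for a in q[n]] for n in range(5))
```

Note that $c_1$ never enters either computation, so in this sketch the numerators do not depend on $c_1$.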

LEMMA 2. $K'(z)$ is associated with one and only one “positive definite” power series.*

In symbols 
~ = 
2” 


n=0 


This follows directly from the results due to Hamburger [3], since, in $K'(z)$, all $\lambda_i > 0$ ($i = 1, 2, 3, \ldots$).


LEMMA 3. If $K(z)$ converges in a certain $\Omega$-region, then $K'(z)$ converges in that same region to a function which is regular and analytic in that region.


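* The series
$$P(1/z) = \sum_{n=0}^{\infty} \frac{a_n}{z^{n+1}}$$
is said to be “positive definite” if all the determinants


$$\Delta_n = \left[a_{i,j}\right]_{i,j=0}^{n-1} > 0 \quad \text{for } n = 1, 2, \ldots,$$


where $a_{i,j} = a_{i+j}$. We set also $\Delta_0 = 1$.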




It is known [2] that, if $K(z)$ is convergent, $K'(z)$ must either converge or diverge to infinity. The latter is impossible, for, by Grommer’s Theorem, we may select a sub-sequence $\{K'_{n_i}(z)\}$ ($i = 1, 2, 3, \ldots$) of the convergents of $K'(z)$ which in the $\Omega$-region converges to a Stieltjes integral of the form (3):

$$F_1(z) = \int_{-\infty}^{\infty} \frac{d\psi_1(u)}{z - u}$$


and $F_1(z)$ is regular and analytic in $\Omega$. Hence $K'(z)$ itself converges in that $\Omega$-region to $F_1(z)$.
COROLLARY. If $F(z)$ and $F_1(z)$, both analytic and non-vanishing in $\Omega$, are the limit functions of $K(z)$ and $K'(z)$ respectively, then


$$F(z) = \frac{\lambda_1}{z - c_1 - F_1(z)} \qquad (z \text{ in } \Omega). \tag{10}$$


LEMMA 4. If $K(z)$ converges completely to
$$F(z) = \int_{-\infty}^{\infty} \frac{d\psi(u)}{z - u},$$


then $K'(z)$ converges completely to a certain $F_1(z)$.


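By the definition of $K_n(z, t)$ (see (6)) we write


$$K_n(z, t) = \cfrac{\lambda_1}{z - c_1 - K'_{n-1}(z, t)}, \qquad K'_{n-1}(z, t) = \cfrac{\lambda_2}{z - c_2 - \cfrac{\lambda_3}{z - c_3 - \cdots - \cfrac{\lambda_n}{z - c_n + t}}}. \tag{11}$$


Hence $K'_{n-1}(z, t)$ plays the same rôle for $K'(z)$ as $K_{n-1}(z, t)$ does for $K(z)$. Using the definition of complete convergence,


$$|K_n(z, t) - F(z)| < \varepsilon \qquad (n \ge N(\varepsilon, \Omega),\ z \text{ in } \Omega), \tag{12}$$


we are led through (10) and (11), and with the same $z$, $n$, $\varepsilon$, $t$, $\Omega$ as in (12), to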


[Kalz, 


Furthermore, we know that 




> 0; (z in Q), 


h 
| K,(2, t) | = iy >0 (z in 2, 2 = N) (by (12)). 


1 | | 
2) F(z) 
is bounded above for all real $t$ (including $t = \infty$) and for all $z$ in $\Omega$, and thus our lemma is proved.
Combining our lemmas, we are in a position to prove 


THEOREM I. The numerators $\{\Omega_n(x)\}$ of the continued fraction $K(x)$ or, which is the same, the numerators of the even convergents of the continued fraction $W(x)$, constitute a set of polynomials orthogonal with regard to a function $\psi_1(x)$ of the same type as $\psi(x)$, i.e.


$$\int_{-\infty}^{\infty} \Omega_n(x)\,\Omega_m(x)\,d\psi_1(x) = 0, \quad m \ne n \qquad (m, n = 1, 2, 3, \ldots). \tag{13}$$


THEOREM II. $\psi_1(z)$, as a solution of the moments problem associated with $K'(z)$, is uniquely determined if $\psi(z)$ is uniquely determined as a solution of the moments problem associated with $K(z)$.


The proof of Theorem I follows from the fact that $K'(z)$, being of the same type as $K(z)$, is, therefore, associated, in the sense of (4), with at least one Stieltjes integral of type (3): $\int_{-\infty}^{\infty} (1/(z-u))\,d\psi_1(u)$, which leads to orthogonality relations (13) similar to (5), and here also $(-\infty, \infty)$ may reduce to a certain sub-interval.

Theorem II follows from Lemma 4 combined with Hamburger’s necessary and sufficient condition for the determined character of the moments problem as given above (see page 66).

We can also state 


THEOREM III. Let $(\lambda, L)$ and $(\lambda', L')$ denote the “true” intervals of orthogonality for $\{\phi_n(x)\}$ and $\{\Omega_n(x)\}$ respectively.

(i) $(\lambda', L') < (\lambda, L)$.

(ii) If $(\lambda, L)$ is finite, then, in general, $\lambda' = \lambda$ and $L' = L$. Any sub-interval of $(\lambda, L)$ which does not contain an interval of constancy of $\psi(x)$ possesses the same property with regard to $\psi_1(x)$.

(iii) If $(\lambda, L)$ is infinite, so is $(\lambda', L')$; more precisely, if $(\lambda, L) = (\lambda, \infty)$, $(-\infty, L)$ ($\lambda$, $L$ finite) then, respectively, $L' = \infty$ or $\lambda' = -\infty$. If $(\lambda, L) = (-\infty, \infty)$, then $(\lambda', L') = (-\infty, \infty)$ [4].




By a “non-reducible” or “true” interval of orthogonality we mean [4] the interval determined by the limits of the least and greatest roots of $\phi_n(x)$ as $n$ increases indefinitely. Stieltjes [1] proved, for the interval $(0, \infty)$, to which we can always reduce the intervals $(\lambda, \infty)$, $(-\infty, L)$ ($\lambda$ or $L$ finite), that, if the largest root of $\phi_n(x)$ approaches infinity with $n$, then an infinite sequence of such roots approach infinity. Since we know that the zeros of $\phi_n(x)$ separate those of $\Omega_n(x)$, the theorem is established for the simply infinite interval. We need thus consider only the case of $(\lambda, L) = (-\infty, \infty)$. For this purpose we make use of the following results of Hamburger [3]: $\psi_n(x)$ is defined as a weight function of order $n$ relative to the positive definite series $P(1/z) = \sum_{\nu=0}^{\infty} a_\nu / z^{\nu+1}$ if


(a) $\psi_n(x)$ is a step function with exactly $n$ points of increase, $x_{n,i}$. Let the saltus at such a point equal $M_{n,i}$ ($i = 1, 2, \ldots, n$). Thus


$$\psi_n(x) = 0 \ \ (-\infty < x < x_{n,1}), \qquad \psi_n(x) = M_{n,1} \ \ (x_{n,1} \le x < x_{n,2}),$$
$$\psi_n(x) = M_{n,1} + M_{n,2} \ \ (x_{n,2} \le x < x_{n,3}), \quad \ldots, \quad \psi_n(x) = \sum_{i=1}^{n} M_{n,i} = a_0 \ \ (x_{n,n} \le x). \tag{14}$$


(b) At least the first $(2n - 1)$ moments of the weight function $\psi_n(x)$ are identical with those of $\psi(x)$:


$$\int_{-\infty}^{\infty} x^{\nu}\,d\psi_n(x) = a_\nu \qquad (\nu = 0, 1, 2, \ldots, 2n - 1).$$


From the theory of continued fractions [1] we know that the zeros of $\phi_{n+1}(x)$ separate those of $\phi_n(x)$, i.e., if the zeros of $\phi_n(x)$, in order of magnitude, are denoted by $x_{n,1}, x_{n,2}, \ldots, x_{n,n}$, we have


$$x_{n+1,1} < x_{n,1} < x_{n+1,2} < x_{n,2} < \cdots < x_{n,n-1} < x_{n+1,n} < x_{n,n} < x_{n+1,n+1}.$$
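The separation of zeros can be observed numerically. The following is a minimal sketch, assuming the denominators satisfy $\phi_k = (x - c_k)\phi_{k-1} - \lambda_k\phi_{k-2}$ with $\phi_0 = 1$; it locates zeros by bisection on a Sturm-type sign-change count (a standard technique swapped in here for illustration, not the argument of the text). The coefficients and the bracketing interval are hypothetical.

```python
def phi_values(x, c, lam, n):
    """Return [phi_0(x), ..., phi_n(x)] from the three-term recurrence."""
    vals = [1.0]
    prev2, prev1 = 0.0, 1.0
    for k in range(1, n + 1):
        cur = (x - c[k]) * prev1 - lam[k] * prev2
        vals.append(cur)
        prev2, prev1 = prev1, cur
    return vals

def zeros_above(x, c, lam, n):
    """Sign changes in phi_0(x..phi_n(x): the number of zeros of phi_n above x."""
    sign, changes = 1, 0
    for v in phi_values(x, c, lam, n)[1:]:
        if v == 0.0:
            continue  # usual Sturm convention: skip exact zeros
        s = 1 if v > 0 else -1
        if s != sign:
            changes += 1
        sign = s
    return changes

def zeros(c, lam, n, lo=-20.0, hi=20.0, iters=80):
    """Zeros of phi_n in increasing order, each found by bisection on the count."""
    roots = []
    for k in range(1, n + 1):
        a, b = lo, hi
        for _ in range(iters):
            m = 0.5 * (a + b)
            if zeros_above(m, c, lam, n) > n - k:
                a = m  # the k-th smallest zero lies above m
            else:
                b = m
        roots.append(0.5 * (a + b))
    return roots

def interlaces(inner, outer):
    """True if each zero of the lower-degree polynomial separates two of the higher."""
    return all(outer[i] < inner[i] < outer[i + 1] for i in range(len(inner)))

# Hypothetical coefficients; all zeros lie well inside (-20, 20).
c = {1: 0.5, 2: -1.0, 3: 2.0, 4: 0.0, 5: 1.0}
lam = {1: 1.0, 2: 2.0, 3: 0.5, 4: 1.5, 5: 1.0}
z3, z4, z5 = (zeros(c, lam, n) for n in (3, 4, 5))
```

With these values one can call `interlaces(z3, z4)` and `interlaces(z4, z5)` to compare consecutive degrees; the bracket `(lo, hi)` must be widened if larger coefficients are used.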


This shows that the largest zero and the next largest zero of $\phi_n(x)$ increase with $n$. Since we are dealing with a true interval of orthogonality, we have, by hypothesis,


(A) $x_{n,1} \to -\infty$, $x_{n,n} \to +\infty$ ($n \to \infty$).
Let us assume
(B) $x_{n,2} \to \lambda'$, $x_{n,n-1} \to L'$ ($n \to \infty$; $\lambda'$ or $L'$, or both, finite).


We then prove that assumption (B) contradicts the hypothesis (A).
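Consider the relations [2]


$$\frac{\Omega_n(x)}{\phi_n(x)} = \sum_{i=1}^{n} \frac{M_{n,i}}{x - x_{n,i}}, \qquad \sum_{i=1}^{n} M_{n,i}\,x_{n,i}^{\nu} = a_\nu \quad (\nu = 0, 1, \ldots, 2n - 1;\ M_{n,i} > 0). \tag{15}$$


In particular


$$\sum_{i=1}^{n} M_{n,i}\,x_{n,i}^{2} = a_2 = \int_{-\infty}^{\infty} x^2\,d\psi(x) > 0.$$


It follows that
$$M_{n,n}\,x_{n,n}^{2} < a_2, \qquad M_{n,n} \to 0 \quad (n \to \infty) \quad \text{(by (A))}.$$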


In the relations which define $\psi_n(x)$, let us take for $M_{n,i}$ and $x_{n,i}$ the same quantities as in (15). Then, as was shown by Hamburger [3], (b) is satisfied. Furthermore, there is a sub-sequence $\{\psi_{n_\nu}(x)\}$ ($\nu = 1, 2, \ldots$) which, for $\nu \to \infty$, approaches $\psi(x)$ as a limit at all its points of continuity. For brevity we shall write $\psi_\nu(x)$ in place of $\psi_{n_\nu}(x)$. Then


= = a (1, SxSit+o). 
rm 


Take x =L’’, any point of continuity of such that <x,,. 
Then 


VAL”) = = a0 — M,, 


and 
lim ¥,(L”) = ¥(L”) = ao — lim M,,, = ao. 


Therefore, since $\psi(\infty) = a_0$, $\psi(x)$ is constant in the interval $(L', \infty)$ and all integrals involving $d\psi(x)$ would have $L'$ for the upper limit of integration. Hence, $x_{n_\nu, n_\nu} < L'$, for all $\nu$, in contradiction to (A).*

In similar manner, using $M_{n,1}$ instead of $M_{n,n}$, we can show that the assumption $\lim_{n\to\infty} x_{n,2} = \lambda'$, finite, involves a contradiction ($\psi(x)$ constant in $(-\infty, \lambda')$). This proves our theorem for, if $\lim x_{n,n-1} = \lim x_{n,n} = \infty$ and $\lim x_{n,2} = \lim x_{n,1} = -\infty$ ($n \to \infty$), then, due to separation, the greatest and the least zeros of $\Omega_n(x)$ approach $+\infty$ and $-\infty$, respectively. Henceforth we shall, in general, denote by $(a, b)$ the true interval of orthogonality, finite or infinite, for the set $\{\phi_n(x)\}$.

* It is known that, if

$$\int_a^b \phi_m(x)\phi_n(x)\,d\psi(x) = 0 \qquad (m \ne n;\ m, n = 0, 1, 2, \ldots),$$
then all zeros of $\phi_n(x)$ ($n = 1, 2, \ldots$) are real, distinct and between $a$ and $b$.




2. We now turn to the corresponding continued fraction $W(z)$ as given in (2) and its convergents $U_n(z)/V_n(z)$. Hereafter we assume, with Stieltjes [1], that $b_i > 0$ ($i = 1, 2, \ldots$). Then the corresponding Stieltjes integral is of the form


$$\int_0^{\infty} \frac{d\psi(u)}{z - u} \tag{16}$$


($(0, \infty)$ perhaps being reducible to $(a, b)$ with $a > 0$ [1, 2]). Since we get the continued fraction (1) by contraction from (2), $U_{2n}(z) = \Omega_n(z)$, $V_{2n}(z) = \phi_n(z)$ ($n = 0, 1, 2, \ldots$), we shall restrict ourselves to the odd convergents $U_{2n-1}(z)/V_{2n-1}(z)$ ($n \ge 1$). From the difference relations
$$U_{2n-1}(z) = z\,U_{2n-2}(z) - b_{2n-1}U_{2n-3}(z),$$
$$U_{2n-2}(z) = U_{2n-3}(z) - b_{2n-2}U_{2n-4}(z) \qquad (n \ge 2,\ U_0 = 0),$$
with similar expressions for $V_{2n-1}(z)$, $V_{2n-2}(z)$, we derive easily
$$U_{2n-1}(z) = [z - (b_{2n-1} + b_{2n-2})]\,U_{2n-3}(z) - b_{2n-2}b_{2n-3}U_{2n-5}(z) \tag{17}$$
$$[n > 2;\ U_1(z) = b_1;\ U_3(z) = b_1(z - b_3)],$$
$$V_{2n-1}(z) = [z - (b_{2n-1} + b_{2n-2})]\,V_{2n-3}(z) - b_{2n-2}b_{2n-3}V_{2n-5}(z) \tag{18}$$
$$[n > 2;\ V_1(z) = z;\ V_3(z) = z(z - b_2 - b_3)].$$
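The passage from the two-step difference relations to (17) and (18) can be checked on a concrete case. The sketch below assumes that (2) has partial denominators alternately $z$ and $1$, so that $A_k = d_kA_{k-1} - b_kA_{k-2}$ with $d_k = z$ for odd $k$ and $d_k = 1$ for even $k$, starting from $U_0 = 0$, $U_1 = b_1$, $V_0 = 1$, $V_1 = z$; the sample $b_i$ and helper names are hypothetical. Polynomials are integer coefficient lists, lowest degree first.

```python
def add(p, q, s=1):
    """Return p + s*q for coefficient lists, padding with zeros."""
    n = max(len(p), len(q))
    p = p + [0] * (n - len(p))
    q = q + [0] * (n - len(q))
    return [a + s * b for a, b in zip(p, q)]

def trim(p):
    """Drop trailing zero coefficients (keep at least one entry)."""
    while len(p) > 1 and p[-1] == 0:
        p = p[:-1]
    return p

def times_z(p):
    return [0] + p

def convergents(b, k_max):
    """U_0..U_{k_max} and V_0..V_{k_max} of the corresponding fraction."""
    U = [[0], [b[1]]]
    V = [[1], [0, 1]]
    for k in range(2, k_max + 1):
        for seq in (U, V):
            lead = times_z(seq[k - 1]) if k % 2 == 1 else seq[k - 1]
            seq.append(trim(add(lead, seq[k - 2], -b[k])))
    return U, V

def contracted_step(X, b, n):
    """Right-hand member of (17)/(18) for index n."""
    prev = X[2 * n - 3]
    inner = add(times_z(prev), prev, -(b[2 * n - 1] + b[2 * n - 2]))
    return trim(add(inner, X[2 * n - 5], -b[2 * n - 2] * b[2 * n - 3]))

# Hypothetical positive coefficients b_1 .. b_9.
b = {1: 2, 2: 1, 3: 3, 4: 5, 5: 1, 6: 2, 7: 4, 8: 3, 9: 1}
U, V = convergents(b, 9)
recurrences_hold = all(
    contracted_step(U, b, n) == U[2 * n - 1]
    and contracted_step(V, b, n) == V[2 * n - 1]
    for n in (3, 4, 5)
)
```

In this sketch the odd denominators all have zero constant term, which is the property $V_{2n-1}(0) = 0$ used in the text that follows.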


Since $U_{2n-1}(z)$ and $V_{2n-1}(z)$ are polynomials of degree $(n - 1)$ and $n$, respectively, with $V_{2n-1}(0) = 0$, let


$$U_{2n-1}(z) = b_1Q_{n-1}(z), \qquad V_{2n-1}(z) = z\,S_{n-1}(z) \qquad (n \ge 1) \tag{19}$$


($Q_{n-1}(z)$, $S_{n-1}(z)$ are polynomials of degree $n - 1$). The difference equations (17) and (18) then lead to the following continued fractions of type (1):

(20) $K''(z) = \frac{b_2\,|}{|\,z - b_3} - \frac{b_3b_4\,|}{|\,z - b_4 - b_5} - \frac{b_5b_6\,|}{|\,z - b_6 - b_7} - \cdots$

with convergents $P_n(z)/Q_n(z)$,


(21) $K'''(z) = \frac{b_1b_2\,|}{|\,z - b_2 - b_3} - \frac{b_3b_4\,|}{|\,z - b_4 - b_5} - \frac{b_5b_6\,|}{|\,z - b_6 - b_7} - \cdots$

with convergents $R_n(z)/T_n(z)$.
Consider first $K'''(z)$. Compare $S_n(z)$ as computed from (18) and (19) with
$T_n(z)$ as computed from (21) (making $S_0(z) = T_0(z) = 1$). We see that

(22) $S_n(z) = T_n(z)$ $(n = 1, 2, \dots)$.


1933] STIELTJES CONTINUED FRACTIONS 73 


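Moreover, all the $b_i$ $(i = 1, 2, \dots)$ being positive, $K'''(z)$ may be formally
associated with a positive definite series, $P_2(1/z) = \sum_{i=0}^{\infty} \beta_i/z^{i+1}$ (in the sense of
(3)) the coefficients, $\beta_i$, of which may be computed through known formulas
[2], and also with a Stieltjes integral $\int_0^\infty (1/(z-u))\,d\psi_2(u)$ of type (16), so that

$\int_0^\infty \frac{d\psi_2(u)}{z - u} \sim K'''(z) \sim \sum_{i=0}^{\infty} \frac{\beta_i}{z^{i+1}}$ $\left(\beta_i = \int_0^\infty u^i\,d\psi_2(u)\right)$.

Hence, by the fundamental property of association (see (4), (5)),

$\int_0^\infty T_m(x)T_n(x)\,d\psi_2(x) = 0$ $(m \ne n;\ m, n = 0, 1, 2, \dots)$.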


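Let

(23) $P(1/z) = \sum_{i=0}^{\infty} \frac{\alpha_i}{z^{i+1}}$

be the series associated with $K(z)$, as given in (1), which, as was stated, is
obtained by contraction from $W(z)$ in (2). Hence, since [5]

(24) $\lambda_n = b_{2n-2}b_{2n-1}$, $c_n = b_{2n-1} + b_{2n}$ $(n \ge 2)$; $\lambda_1 = b_1$, $c_1 = b_2$,

$K(z)$ can be re-written as follows:

$K(z) = \frac{b_1\,|}{|\,z - b_2} - \frac{b_2b_3\,|}{|\,z - b_3 - b_4} - \frac{b_4b_5\,|}{|\,z - b_5 - b_6} - \cdots$.

We have then for the $\alpha_i$ [2]

$\alpha_0 = \lambda_1 = b_1$, $\alpha_1 = \lambda_1c_1 = b_1b_2$, $\alpha_2 = \lambda_1(\lambda_2 + c_1^2) = b_1b_2(b_2 + b_3)$, $\cdots$.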
In a similar manner we can express the $\beta_i$ related to the continued fraction
$K'''(z)$. A simple comparison of the values of $\alpha_i$ and $\beta_i$ thus obtained gives
the relation

(25) $\beta_i = \alpha_{i+1}$ $(i = 0, 1, 2, \dots)$.

But, according to Stieltjes,

$K(z) \sim \int_0^\infty \frac{d\psi(x)}{z - x}$, $z$ not in $(0, \infty)$, $\alpha_i = \int_0^\infty x^i\,d\psi(x)$ $(i = 0, 1, 2, \dots)$,

and (25) shows that a solution of the $\{\beta_i\}$-moments problem is $\psi_2(x)$, determined by the relation

$d\psi_2(x) = x\,d\psi(x)$.




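Combining (19) and (22) we get the following result:*

$\int_0^\infty V_{2m}(x)V_{2n}(x)\,d\psi(x) = 0$ implies $\int_0^\infty \frac{1}{x}V_{2m+1}(x)V_{2n+1}(x)\,d\psi(x) = 0$

$(m \ne n;\ m, n = 0, 1, 2, \dots)$.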


We now turn to $K''(z)$, as given in (20). The very form of (20) shows, by
reasoning similar to that employed above, that

(26) $\int_0^\infty Q_m(x)Q_n(x)\,d\psi_3(x) = 0$, $m \ne n$ $(m, n = 0, 1, 2, \dots)$.


Comparison of $K'(x)$ and $K''(x)$ (see (7), (20), (24) combined with what we
know about the distribution of the roots of $\Omega_n(x)$) shows that


Furthermore, if we compare the coefficients of the power series of type (23),
associated with $K'(z)$ and $K''(z)$ respectively, we find again, as above, that
we may take in (26) and (26a)

$x\,d\psi_3(x) = d\psi_1(x)$,

i.e. the relations (26) may be rewritten as

$\int_0^\infty Q_m(x)Q_n(x)\frac{1}{x}\,d\psi_1(x) = 0$, $m \ne n$ $(m, n = 0, 1, 2, \dots)$.


Let, further, 
| bsbe | 


Kiv = a= — 


Then, following the method used in Lemma 4, it can be shown that if $K^{iv}(z)$ is
completely convergent, so is $K''(z)$, and that the complete convergence of
$K(z)$ implies that of $K'''(z)$, which in turn implies the complete convergence
of $K^{iv}(z)$. Hence, the complete convergence of $K(z)$ implies that of $K''(z)$.
The discussion of the intervals of orthogonality is similar to that of the 
previous case. We may, then, combine our results into the general 


THEOREM IV. A Stieltjes continued fraction of type (2) $(b_i > 0;\ i = 1, 2, 3, \dots)$
with the convergents $U_i(z)/V_i(z)$ $(i = 0, 1, 2, \dots)$ gives rise to four sets of orthogonal polynomials of degree $0, 1, 2, \dots$:

(a) $\int_0^\infty V_{2m}(x)V_{2n}(x)\,d\psi(x) = 0$,
(b) $\int_0^\infty U_{2m+2}(x)U_{2n+2}(x)\,d\psi_1(x) = 0$,
(c) $\int_0^\infty S_m(x)S_n(x)\,x\,d\psi(x) = 0$ $\left(\frac{1}{x}V_{2n+1}(x) = S_n(x)\right)$,
(d) $\int_0^\infty Q_m(x)Q_n(x)\frac{1}{x}\,d\psi_1(x) = 0$ $(Q_n(x) = U_{2n+1}(x))$
$(m \ne n;\ m, n = 0, 1, 2, 3, \dots)$.

If the moments problem related to the set $\{\phi_n(x)\} = \{V_{2n}(x)\}$ is determined then
the moments problems related to the other three sets are also determined.

* This result was first obtained, in an entirely different manner, by J. Shohat [6].
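Two of the four orthogonality statements can be illustrated on a concrete case. The sketch below takes the weight $e^{-x}$ on $(0, \infty)$, for which the moments are $k!$, and assumes the coefficients $b_1 = 1$, $b_{2k} = b_{2k+1} = k$; these values are derived here from the Laguerre recurrence through (24) and are not given in the text. It checks that the $V_{2n}$ are orthogonal with respect to $d\psi(x) = e^{-x}dx$ and the $S_n = V_{2n+1}/x$ with respect to $x\,d\psi(x)$. The form of (2) assumed is $b_1/(z - b_2/(1 - b_3/(z - \cdots)))$.

```python
from math import factorial

def b_laguerre(i):
    # Assumed coefficients for the weight exp(-x): b_1 = 1, b_2k = b_2k+1 = k.
    return 1 if i == 1 else i // 2

def denominators(b, kmax):
    """Coefficient lists (constant term first) of V_0, ..., V_kmax."""
    V = [[1], [0, 1]]                      # V_0 = 1, V_1 = z
    for k in range(2, kmax + 1):
        lead = [0] + V[k - 1] if k % 2 == 1 else V[k - 1][:]
        tail = V[k - 2]
        V.append([lead[j] - (b(k) * tail[j] if j < len(tail) else 0)
                  for j in range(len(lead))])
    return V

def inner(p, q, moment):
    """Integral of p*q against a measure given by its moment function."""
    return sum(a * c * moment(i + j)
               for i, a in enumerate(p) for j, c in enumerate(q))

def check_orthogonality(nmax):
    V = denominators(b_laguerre, 2 * nmax + 1)
    even = [V[2 * n] for n in range(nmax + 1)]            # V_2n
    S = [V[2 * n + 1][1:] for n in range(nmax + 1)]       # V_2n+1 / z
    ok = True
    for m in range(nmax + 1):
        for n in range(m):
            ok = ok and inner(even[m], even[n], factorial) == 0
            ok = ok and inner(S[m], S[n], lambda k: factorial(k + 1)) == 0
    return ok
```

Statements (b) and (d) involve $\psi_1$, whose moments are not available in closed form here, so they are left out of the sketch.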


We notice that the main feature of our proof consisted in constructing 
continued fractions of the type K(z) for which each of the sets of polynomials 
under discussion are the denominators of the successive convergents. 

3. Differential equation for $\Omega_n(x)$. If $d\psi(x) = p(x)\,dx$, and $p(x)$ is of the
form


(27) p(x) = [] (x — (A; > — 1; a; = const.; Q(x) a polynomial) 
t=1 
then it has been shown by J. Shohat [7] that

(28) $F'(x) = T(x)F(x) + R(x)$, $F(x) = \int_a^b \frac{p(y)\,dy}{x - y}$

($T(x) = p'(x)/p(x)$, $R(x)$ a rational function), and that the corresponding
orthogonal polynomials satisfy a homogeneous linear differential equation
of second order

(29) $A_n(x)y_n''(x) + B_n(x)y_n'(x) + C_n(x)y_n(x) = 0$
($A_n(x)$, $B_n(x)$, $C_n(x)$ polynomials).
In particular, for the classical orthogonal polynomials, of Jacobi, Laguerre
and Hermite, $p(x)$ is of type (27) and (29) assumes a very simple form. In
fact $A_n(x)$, $B_n(x)$ are polynomials independent of $n$, of degree $\le 2$, $1$, respectively, while $C_n$ is a constant depending on $n$ only. They are as follows:

(30)
              | interval of orthogonality | $p(x)$                                                        | $A_n(x) = A(x)$ | $B_n(x) = B(x)$                            | $C_n(x) = C_n$
Jacobi (J)    | $(a, b)$ finite           | $(x-a)^{\alpha-1}(b-x)^{\beta-1}$ $(\alpha, \beta > 0)$       | $(x-a)(b-x)$    | $\alpha b + \beta a - (\alpha+\beta)x$     | $n(\alpha+\beta+n-1)$
Laguerre (L)  | $(0, \infty)$             | $x^{\alpha-1}e^{-x}$ $(\alpha > 0)$                           | $x$             | $\alpha - x$                               | $n$
Hermite (H)   | $(-\infty, \infty)$       | $e^{-x^2}$                                                    | $1$             | $-2x$                                      | $2n$

Furthermore,

(31) $p(x) = \frac{1}{A(x)}\exp\int\frac{B(x)}{A(x)}\,dx$.
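As a check on (31), the Jacobi case can be worked through by partial fractions. The computation below is a sketch that takes $A(x) = (x-a)(b-x)$ and $B(x) = \alpha b + \beta a - (\alpha+\beta)x$; the plus sign before $\beta a$ is an assumption, chosen because it is the sign that reproduces the Jacobi weight.

```latex
\frac{B(x)}{A(x)}
  = \frac{\alpha(b-x) - \beta(x-a)}{(x-a)(b-x)}
  = \frac{\alpha}{x-a} - \frac{\beta}{b-x},
\qquad
\int \frac{B(x)}{A(x)}\,dx = \alpha\log(x-a) + \beta\log(b-x),
\qquad
\frac{1}{A(x)}\exp\int\frac{B(x)}{A(x)}\,dx
  = \frac{(x-a)^{\alpha}(b-x)^{\beta}}{(x-a)(b-x)}
  = (x-a)^{\alpha-1}(b-x)^{\beta-1}.
```

The same computation with $A = x$, $B = \alpha - x$ gives $x^{\alpha-1}e^{-x}$, and with $A = 1$, $B = -2x$ gives $e^{-x^2}$.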


In order to derive a differential equation for $\Omega_n(x)$ we need one more relation
from the theory of continued fractions [2]:

(32) $\phi_n(x)F(x) = \Omega_n(x) + R_n(x)$ $(R_n(x) \not\equiv 0)$.

Moreover, [7], for $p(x)$ of the type (27), $R_n(x)/p(x)$ is a second solution of the
differential equation (29). From (32) we find

$R_n(x)/p(x) = \phi_n(x)F(x)/p(x) - \Omega_n(x)/p(x)$.
Substituting this expression into (29), we find the desired equation after a
somewhat lengthy but simple computation:

(33) $A_n(x)\Omega_n''(x) + [B_n(x) - 2A_n(x)p'(x)/p(x)]\Omega_n'(x)$
     $+ [C_n(x) - B_n(x)p'(x)/p(x) - A_n(x)p''(x)/p(x) + 2A_n(x)p'(x)^2/p(x)^2]\Omega_n(x)$
     $= [A_n(x)R'(x) - (A_n(x)p'(x)/p(x))R(x) + B_n(x)R(x)]\phi_n(x) + 2A_n(x)R(x)\phi_n'(x)$.

(33) can be very greatly simplified for the classical polynomials (30). We show
that here $R(x)$ has a very simple expression:

(34) $R(x) = \tfrac{1}{2}(A''(x) - B'(x) + C_1)/A(x) = \tfrac{1}{2}k_1/A(x)$ (see (36) below).

Using the expression for $p(x)$ from (31), (33) becomes

(35) $A(x)\Omega_n''(x) + [2A'(x) - B(x)]\Omega_n'(x) + [A''(x) - B'(x) + C_n]\Omega_n(x)$
     $= [A(x)R(x)]'\phi_n(x) + 2A(x)R(x)\phi_n'(x)$ $(n = 1, 2, \dots)$.


We now let $n = 1$, and observe that $\Omega_n(x)$ is a polynomial of degree $n-1$, and
that $A(x)$ and $B(x)$ do not depend on $n$. Making the leading coefficient of
$\Omega_n(x)$ equal 1 we obtain

$A''(x) - B'(x) + C_1 = [A(x)R(x)]'(x - c_1) + 2A(x)R(x)$.

Let (see (30))

(36) $A''(x) - B'(x) + C_1 = k_1$ (constant independent of $x$).

We have

$[A(x)R(x)]'(x - c_1) + 2A(x)R(x) = k_1$,

and this we treat as a differential equation in $A(x)R(x)$, which gives

(37) $A(x)R(x) = \frac{D}{(x - c_1)^2} + k_1/2$ ($D$ = const.).

It is now easy to show that $D = 0$. In fact, substituting (37) into (35), we get,
for $n = 2$, on the right side

$-[2D/(x - c_1)^3]\phi_2(x) + 2[D/(x - c_1)^2 + k_1/2]\phi_2'(x)$
$= k_1\phi_2'(x) - [2D/(x - c_1)^3][\phi_2(x) - (x - c_1)\phi_2'(x)]$,

and this can reduce to a polynomial (as in the left side of (35)), if and only if
$D = 0$.
We thus proved (34), which, substituted into (35), gives the following differential equation for $\Omega_n(x)$ in the classical cases:

(38) $A(x)\Omega_n''(x) + [2A'(x) - B(x)]\Omega_n'(x) + k_n\Omega_n(x) = k_1\phi_n'(x)$
     $(k_n = A''(x) - B'(x) + C_n;\ n = 1, 2, \dots)$.


The coefficients of the differential equation for the numerators are thus seen
to be the coefficients of the adjoint of the differential equation (29) for the
denominators.

We shall write equation (38) explicitly (see (30)):

(39) J: $(x - a)(b - x)\Omega_n''(x) + [2(a + b) - \alpha b - \beta a + (\alpha + \beta - 4)x]\Omega_n'(x)$
     $+ [(\alpha + \beta)(n + 1) + n(n - 1) - 2]\Omega_n(x) = 2(\alpha + \beta - 1)\phi_n'(x)$ [8].

(40) L: $x\Omega_n''(x) - (\alpha - 2 - x)\Omega_n'(x) + (n + 1)\Omega_n(x) = 2\phi_n'(x)$.

(41) H: $\Omega_n''(x) + 2x\Omega_n'(x) + 2(n + 1)\Omega_n(x) = 4\phi_n'(x)$.
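The Laguerre and Hermite equations can be tested with exact polynomial arithmetic. The sketch below builds $\phi_n$ and $\Omega_n$ from the three-term recurrence $y_n = (x - c_n)y_{n-1} - \lambda_n y_{n-2}$ with $\phi_0 = 1$, $\Omega_0 = 0$, $\Omega_1 = 1$, and checks $A\Omega_n'' + (2A' - B)\Omega_n' + k_n\Omega_n = k_1\phi_n'$. The recurrence coefficients used ($c_n = 0$, $\lambda_n = (n-1)/2$ for Hermite; $c_n = 2n - 2 + \alpha$, $\lambda_n = (n-1)(n-2+\alpha)$ for Laguerre) are the standard monic values and are supplied here as assumptions, as is the normalization $\Omega_1 = 1$.

```python
from fractions import Fraction

def padd(p, q):
    n = max(len(p), len(q))
    return [(p[i] if i < len(p) else 0) + (q[i] if i < len(q) else 0)
            for i in range(n)]

def pscale(c, p):
    return [c * a for a in p]

def pmul(p, q):
    out = [0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out

def pderiv(p):
    return [k * p[k] for k in range(1, len(p))] or [0]

def sequences(c, lam, nmax):
    """Denominators phi_n and numerators omega_n, coefficient lists."""
    phi = [[1], [-c(1), 1]]
    omega = [[0], [1]]
    for n in range(2, nmax + 1):
        step = [-c(n), 1]                       # the factor x - c_n
        phi.append(padd(pmul(step, phi[n - 1]), pscale(-lam(n), phi[n - 2])))
        omega.append(padd(pmul(step, omega[n - 1]), pscale(-lam(n), omega[n - 2])))
    return phi, omega

def satisfies_numerator_equation(A, B, C, c, lam, nmax):
    phi, omega = sequences(c, lam, nmax)
    dA, dB = pderiv(A), pderiv(B)
    base = padd(pderiv(dA), pscale(-1, dB))[0]  # A'' - B', a constant here
    k1 = base + C(1)
    for n in range(1, nmax + 1):
        kn = base + C(n)
        d1 = pderiv(omega[n])
        d2 = pderiv(d1)
        middle = pmul(padd(pscale(2, dA), pscale(-1, B)), d1)
        lhs = padd(padd(pmul(A, d2), middle), pscale(kn, omega[n]))
        rhs = pscale(k1, pderiv(phi[n]))
        if any(t != 0 for t in padd(lhs, pscale(-1, rhs))):
            return False
    return True

def hermite_ok(nmax):
    return satisfies_numerator_equation(
        [1], [0, -2], lambda n: 2 * n,
        lambda n: 0, lambda n: Fraction(n - 1, 2), nmax)

def laguerre_ok(alpha, nmax):
    return satisfies_numerator_equation(
        [0, 1], [alpha, -1], lambda n: n,
        lambda n: 2 * n - 2 + alpha, lambda n: (n - 1) * (n - 2 + alpha), nmax)
```

For example, `hermite_ok(8)` and `laguerre_ok(Fraction(5, 2), 8)` exercise (41) and (40) for $n = 1, \dots, 8$.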

Making use of the simple expression of $R(x)$ as given in (34), we can find
another expression for $F(x)$ from (28):

(42) $F(x) = p(x)\left[\frac{k_1}{2}\int\frac{dx}{A(x)p(x)} + C\right]$ ($C$ = const.).

Remarks. (i) If we differentiate (38) twice and, where $y_n = \phi_n(x)$, (29)
once, eliminate $\phi_n^{(i)}(x)$ $(i = 0, 1, 2, 3)$, we obtain for $\Omega_n(x)$ a linear homogeneous
equation of the 4th order.


(ii) The general solution of (38) is

$\Omega_n(x) + D_1p(x)\phi_n(x) + D_2R_n(x)$ ($D_{1,2}$ = const.)

for [7] $p(x)\phi_n(x)$ and $R_n(x)$ are solutions of

$A(x)y_n''(x) + [2A'(x) - B(x)]y_n'(x) + [A''(x) - B'(x) + C_n]y_n(x) = 0$.


4. The differential equations for $\{\Omega_n(x)\}$ of Laguerre and Hermite cases
as limiting cases of that for the $\{\Omega_n(x)\}$ of the Jacobi case. Following a
method indicated by P. Appell and J. Kampé de Fériet [9], we write the
differential equations for the $\{\Omega_n(x)\}$ for the Jacobi case in the interval $(0, 1)$
and the characteristic function $x^{\alpha-1}(1 - x)^{\beta-1}$ or $\Omega_n[x; 0, 1; x^{\alpha-1}(1 - x)^{\beta-1}]$:

(43) $x(1 - x)\Omega_n''(x) + [2 - \alpha + (\alpha + \beta - 4)x]\Omega_n'(x)$
     $+ [(\alpha + \beta)(n + 1) + n(n - 1) - 2]\Omega_n(x)$
     $= 2(\alpha + \beta - 1)\,n\,\phi_{n-1}[x; 0, 1; x^{\alpha}(1 - x)^{\beta}]$;

since in the classical cases generally $\phi_n'[x; a, b; p] = n\phi_{n-1}[x; a, b; Ap]$ where
A=A(zx) is the coefficient of ¢/’ (x) in (29). In (43) let B—1=s, x=x,/s, 
divide by s¥/? s@-«/2 (@+s—3) and then let s+. We thus obtain 


$x_1\Omega_n''(x_1) + (2 - \alpha + x_1)\Omega_n'(x_1) + (n + 1)\Omega_n(x_1) = 2\phi_n'(x_1)$,

which is the differential equation (40) for the $\Omega_n(x_1)$ in the Laguerre case.


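Again, write the differential equation for $\Omega_n[x; -1, 1; (1 + x)^{\alpha-1}(1 - x)^{\beta-1}]$:

$(1 - x^2)\Omega_n''(x) + [\beta - \alpha + (\alpha + \beta - 4)x]\Omega_n'(x)$
$+ [(\alpha + \beta)(n + 1) + n(n - 1) - 2]\Omega_n(x) = 2(\alpha + \beta - 1)\phi_n'(x)$;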

let a—1=8—1=s, x=x,s~-"?, divide through by s"4(s—1), and let soo: 
$\Omega_n''(x_1) + 2x_1\Omega_n'(x_1) + 2(n + 1)\Omega_n(x_1) = 4\phi_n'(x_1)$,

which is the differential equation (41) for the $\Omega_n(x_1)$ in the Hermite case.

Hence, the numerators as well as the denominators of the associated con- 
tinued fractions of Laguerre and Hermite cases may be considered as limiting 
cases of those of the Jacobi case. 

5. Relations between $\{\phi_n(x)\}$ and $\{\Omega_n(x)\}$. (i) It is evident, from what
has been said before, that we may write the associated continued fraction
$K'(x)$, having $\{\Omega_n(x)\}$ as denominators of its convergents, directly from the
given $K(x)$. Also, given the corresponding continued fraction $W(x)$, we may,
from (20) and (21), write the continued fractions of type $K(x)$ where $U_{2n}(x)$,
$U_{2n+1}(x)$, $(1/x)V_{2n+1}(x)$ will play the same rôle as $V_{2n}(x)$. Let


bn(x) = — +--+; 
= — + --- 
We know [5] that S,=)-7_, c; and since K’(x) is obtained by raising the 
indices of the \,’s and c;’s by 1, it follows directly that 
(44) On = t (s = 1,2,---). 


Let an, pn (n=0, 1,---) represent the “normalizing factors” for the sets 
{dn(x)}, { 2.(x)} respectively, ice. 


f = f = (m,n = 0,1, 2,---), 


Gn(X) = Anda(X), = 
We know [5] that 
a; = (Aide Anti)! (n 0, 1, 2, ). 


It follows then, in the same manner as for (44), that 


Pn = ge (n = 0, 1, er ) 
if we choose the first partial numerator in K’(x) equal to \2, which can be 
done without any loss of generality. 

(ii) Relations between the moments of the distributions dy(x) and dy,(x). 
Introducing proper constant factors, we can choose (x), ¥:(x) in (3), (13), 
so that ao=f".dy(x) =1; Bo=f~..dyi(x) The “associations” 


F P(t i "std 
(2) = Pa/s) = (« f a ) 


= = K’(x) ( i 6 = ) 


t=O 


lead to the formal relation $F(x) = (x - c_1 - F_1(x))^{-1}$ from which we have the
following relations between the $\alpha_i$ and $\beta_i$:


0 O | ay 
0 
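The formal relation $F(x) = (x - c_1 - F_1(x))^{-1}$ determines the $\beta_i$ from the $\alpha_i$ by inverting a power series in $1/x$. The sketch below is one way to carry that out, assuming $\alpha_0 = 1$, $F(x) = \sum \alpha_i/x^{i+1}$ and $F_1(x) = \sum \beta_i/x^{i+1}$; it is not the determinant form of the text, and the function name is illustrative.

```python
from fractions import Fraction
from math import factorial

def associated_moments(alpha, count):
    """Return (c_1, [beta_0, ..., beta_{count-1}]) from moments alpha (alpha[0] == 1).

    Writing 1/F = x * (d_0 + d_1/x + d_2/x^2 + ...), where d is the
    reciprocal of 1 + alpha_1/x + alpha_2/x^2 + ..., and comparing with
    x - c_1 - F_1 gives c_1 = -d_1 and beta_i = -d_{i+2}.
    """
    d = [Fraction(1)]
    for k in range(1, count + 2):
        d.append(-sum(alpha[j] * d[k - j] for j in range(1, k + 1)))
    c1 = -d[1]
    beta = [-d[i + 2] for i in range(count)]
    return c1, beta

# Example: moments k! of the weight exp(-x) on (0, infinity).
laguerre_alpha = [Fraction(factorial(k)) for k in range(6)]
c1, beta = associated_moments(laguerre_alpha, 3)
```

In this example $\beta_0 = \alpha_2 - \alpha_1^2 = 1$, which is $\lambda_2$ for the Laguerre case, and $\beta_1/\beta_0 = 3 = c_2$.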




(iii) The symmetric case. Let the $\{\phi_n(x)\}$ be a set of “symmetric” orthogonal polynomials, i.e.

$c_n = 0$, $\phi_n(-x) = (-1)^n\phi_n(x)$ $(n = 1, 2, \dots)$.

Then from the similarity of the difference equations satisfied by $\{\phi_n(x)\}$ and
$\{\Omega_n(x)\}$ (see (8), (9)), we conclude that the $\{\Omega_n(x)\}$ also form a symmetric
set. However, as may be surmised from the fact that the set $\{\phi_n(x)\}$ involves
one more essential constant ($c_1$) than does the set $\{\Omega_n(x)\}$, the symmetry of
the set $\{\phi_n(x)\}$ is not necessary for that of the set $\{\Omega_n(x)\}$. In fact, if we
take the Jacobi case in the interval $(-1, 1)$, then, from the general formula
[10]

$c_n = (\alpha - \beta)(\alpha + \beta - 2)/[(\alpha + \beta + 2n - 2)(\alpha + \beta + 2n - 4)]$
$(n > 1;\ c_1 = (\alpha - \beta)/(\alpha + \beta))$,

we see that, if $\alpha + \beta = 2$, then $c_n = 0$ $(n = 2, 3, \dots)$, but $c_1 \ne 0$ when $\alpha \ne \beta$. Thus,
in this case, the set $\{\Omega_n(x)\}$ is symmetric while the set $\{\phi_n(x)\}$ is not.
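The formula for $c_n$ can be compared with values computed directly from the moments. The sketch below assumes the Jacobi weight $(1+x)^{\alpha-1}(1-x)^{\beta-1}$ on $(-1, 1)$, normalized to total mass 1, and obtains its moments from those of a Beta$(\alpha, \beta)$ variable $t = (1+x)/2$; the centres $c_n$ are then produced by the usual Stieltjes procedure. All function names are illustrative.

```python
from fractions import Fraction
from math import comb

def jacobi_moments(alpha, beta, count):
    """Moments of x = 2t - 1 where t has the Beta(alpha, beta) distribution."""
    t = [Fraction(1)]
    for j in range(1, count):
        t.append(t[-1] * (alpha + j - 1) / (alpha + beta + j - 1))
    return [sum(comb(k, j) * 2**j * (-1)**(k - j) * t[j] for j in range(k + 1))
            for k in range(count)]

def inner(p, q, m):
    return sum(a * b * m[i + j] for i, a in enumerate(p) for j, b in enumerate(q))

def centres(m, nmax):
    """c_1, ..., c_nmax in phi_n = (x - c_n) phi_{n-1} - lambda_n phi_{n-2}."""
    prev, cur, prev_norm = [Fraction(0)], [Fraction(1)], None
    out = []
    for _ in range(nmax):
        norm = inner(cur, cur, m)
        shifted = [Fraction(0)] + cur                  # x * phi_{n-1}
        c_n = inner(shifted, cur, m) / norm
        lam_n = norm / prev_norm if prev_norm is not None else Fraction(0)
        nxt = [shifted[i]
               - c_n * (cur[i] if i < len(cur) else 0)
               - lam_n * (prev[i] if i < len(prev) else 0)
               for i in range(len(shifted))]
        out.append(c_n)
        prev, cur, prev_norm = cur, nxt, norm
    return out

def c_formula(n, alpha, beta):
    if n == 1:
        return (alpha - beta) / (alpha + beta)
    s = alpha + beta
    return (alpha - beta) * (s - 2) / ((s + 2 * n - 2) * (s + 2 * n - 4))
```

With $\alpha = 1/2$, $\beta = 3/2$ (so $\alpha + \beta = 2$) the computed centres are $-1/2, 0, 0, 0$: only $c_1$ is different from zero, as the text asserts.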

6. Some particular classical cases. In this section we shall study some
particular classical cases in which we obtain an explicit expression for $p_1(x)$,
the characteristic function for the set $\{\Omega_n(x)\}$, also some simple relations
between $\{\phi_n(x)\}$ and $\{\Omega_n(x)\}$.

(i) The sets $\{\phi_n(x)\}$ and $\{\Omega_n(x)\}$ are identical (disregarding constant
factors), i.e.

(45) $\phi_n(x) = \Omega_{n+1}(x)$ $(n = 0, 1, 2, \dots)$.
For the sake of simplicity we consider the case of symmetric {¢,(x)}. Our 
hypothesis leads to 

$F(x) = D_1/(x - D_2F(x))$ ($D_{1,2}$ = const.).
Solving this equation for $F(x)$ and choosing a proper sign for the radical and
proper value for the constants $D_{1,2}$,* we obtain


1(1 — dy [10]. 


F(x) = + (x? — = f 


* The formula 


F(x) = 


shows that for | «| —«, F(x)—+0. On the other hand, in the continued fraction 


the ¢n(x) do not depend on D,, the 2,(x) do not depend on D, and contain D, as a factor. Hence 
having D, we can choose D; so as tu have D,D,=1. 


lx |x 
| 
| 




Hence

$(a, b) = (-1, 1)$; $\psi(x) = \int_{-1}^{x}(1 - x^2)^{1/2}\,dx$; $p(x) = (1 - x^2)^{1/2}$,

and the above is the only case in which, under the given conditions, (45) is
satisfied. The explicit expression for $\Omega_n(x)$ is $(\sin n\theta)/\sin\theta$ $(x = \cos\theta)$.†

(ii) $\phi_n'(x) = \Omega_n(x)$ $(n = 1, 2, \dots)$ (within a constant factor). The differential equation (38) for the $\Omega_n(x)$ where we substitute now $\phi_n'(x)/n = \Omega_n(x)$
gives

(46) $A(x)\Omega_n''(x) + (2A'(x) - B(x))\Omega_n'(x) + (k_n - nk_1)\Omega_n(x) = 0$.

Applying (31) to (46), we obtain

$p_1(x) = 1/p(x)$.

But we know that the characteristic function for $\phi_n'(x)$ is $A(x)p(x)$.
Hence,

(47) $p_1(x) = 1/p(x) = A(x)p(x)$; $p(x) = A(x)^{-1/2}$.

In view of (30), the only case when (47) holds is the Jacobi case for $\alpha = \beta = 1/2$
where the $\phi_n(x)$ are the so-called “trigonometric polynomials” $\cos n \arccos x$
and

$\Omega_n(x) = \frac{\sin n\theta}{\sin\theta}$ $(x = \cos\theta)$.

The characteristic function [10] for the $\{\phi_n(x)\}$ is, in the interval $(-1, 1)$,
$(1 - x^2)^{-1/2}$ and the characteristic function for the $\{\Omega_n(x)\}$ is, by (47),
$(1 - x^2)^{1/2}$.

It is interesting to note that the $\Omega_n(x)$ so obtained are the same as in (i),
for $\psi_1(x)$ obtained from (47) is identical with that found in (i).
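The trigonometric expressions can be confirmed numerically from the three-term recurrence. The sketch below assumes the monic recurrence for the weight $(1 - x^2)^{-1/2}$, namely $c_n = 0$, $\lambda_2 = 1/2$, $\lambda_n = 1/4$ for $n \ge 3$, with $\Omega_0 = 0$, $\Omega_1 = 1$; these values are the standard Tchebycheff ones and are not taken from the text. With that normalization the constant factors are $2^{n-1}$.

```python
import math

def chebyshev_pair(x, nmax):
    """Return lists phi[0..nmax], omega[0..nmax] evaluated at x."""
    phi = [1.0, x]
    omega = [0.0, 1.0]
    for n in range(2, nmax + 1):
        lam = 0.5 if n == 2 else 0.25
        phi.append(x * phi[n - 1] - lam * phi[n - 2])
        omega.append(x * omega[n - 1] - lam * omega[n - 2])
    return phi, omega

def max_deviation(theta, nmax):
    """Largest gap between the recurrence values and cos(n t), sin(n t)/sin(t)."""
    phi, omega = chebyshev_pair(math.cos(theta), nmax)
    worst = 0.0
    for n in range(1, nmax + 1):
        scale = 2.0 ** (n - 1)
        worst = max(worst,
                    abs(scale * phi[n] - math.cos(n * theta)),
                    abs(scale * omega[n] - math.sin(n * theta) / math.sin(theta)))
    return worst
```

Calling `max_deviation(0.7, 8)` returns a value at the level of rounding error, which is what the identities $\phi_n \propto \cos n\theta$ and $\Omega_n \propto \sin n\theta/\sin\theta$ predict.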

(iii) $\Omega_n(x)$ of Jacobi case, in the interval $(-1, 1)$ with $\alpha + \beta = 1$, i.e. with
$p(x) = (1 + x)^{\alpha-1}(1 - x)^{-\alpha}$ $(1 > \alpha > 0)$. The differential equation (39) for $\Omega_n(x)$
of the Jacobi case in the interval $(-1, 1)$, with arbitrary $\alpha, \beta > 0$, is

(48) $(1 - x^2)\Omega_n''(x) + [-\alpha + \beta + (\alpha + \beta - 4)x]\Omega_n'(x)$
     $+ [(\alpha + \beta)(n + 1) + n(n - 1) - 2]\Omega_n(x) = 2(\alpha + \beta - 1)\phi_n'(x)$.


Tt Here, disregarding constant factors, 
Qn(cos 6) sin (x+ (x? — 1) 42)" — (x — (x? —1) 2)" 
gn(cos 6) sin (n-+1)0 1) 
If we take | x] >1 and let n— © we get in the limit (x-+(x?—1)"*)-!= F(x)/ in accordance with 
Markofi’s theorem. 




The case $\alpha + \beta = 1$ deserves special attention, for then (48) becomes a homogeneous differential equation of the same type as that for $\phi_n(x)$:

(49) $(1 - x^2)\Omega_n''(x) + [1 - 2\alpha - 3x]\Omega_n'(x) + (n + 1)(n - 1)\Omega_n(x) = 0$ $(n = 1, 2, \dots)$.

Comparing (49) with (30) (where $n$, $\alpha$, $\beta$ are replaced by $n-1$, $2-\alpha$, $\alpha+1$
respectively) we conclude that the $\Omega_n(x)$ are identical with the $\phi_n(x)$ corresponding to $(1 + x)^{1-\alpha}(1 - x)^{\alpha} = 1/p(x)$. For the interval $(0, 1)$, making use
of Theorem IV, we thus obtain

THEOREM V. If in the continued fraction $W(x)$, given by (2), with the convergents $U_i(x)/V_i(x)$ $(i = 0, 1, \dots)$, the set $\{V_{2n}(x)\}$ is orthogonal in $(0, 1)$
with the characteristic function $x^{\alpha-1}(1 - x)^{-\alpha}$ $(1 > \alpha > 0)$, then the other three sets,
$\{(1/x)V_{2n+1}(x)\}$, $\{U_{2n}(x)\}$, $\{U_{2n+1}(x)\}$, are orthogonal in $(0, 1)$ with the characteristic functions $x^{\alpha}(1 - x)^{-\alpha}$, $x^{1-\alpha}(1 - x)^{\alpha}$, $x^{-\alpha}(1 - x)^{\alpha}$, respectively.

The trigonometric polynomials considered above in (ii) are evidently a
special case $\alpha = \beta = 1/2$.

(iv) Using the difference equations ((17), (18)) and the fact that $K'''(x)$
has, for denominators of its convergents, polynomials orthogonal with respect
to the distribution $x\,d\psi(x)$ $(b_1 = 1)$, we obtain

(50) $V_{2n}(x\,d\psi(x)) + U_{2n}(x\,d\psi(x)) = U_{2n+1}(d\psi(x))$ $(b_1 = 1;\ n = 1, 2, \dots)$.

In fact, $T_i(x) + R_i(x) = Q_i(x)$ $(i = 0, 1)$, and $T_n(x)$ and $R_n(x)$ satisfy the same
difference equation, except for initial conditions, as $Q_n(x)$; then

$R_n(x) + T_n(x) = Q_n(x)$ $(n = 1, 2, \dots)$.


7. Expansion of $\{\Omega_n(x)\}$ in terms of $\{\phi_n(x)\}$. Our starting point is the
integral representation of $\Omega_n(x)$ [2],

$\Omega_n(x) = \int_a^b \frac{\phi_n(x) - \phi_n(y)}{x - y}\,d\psi(y)$.


We find easily, writing 


on(x) — only) 
Ln = 1, 
Ln = ¥ — Cn, 
Lnn—3(¥) = (¥ — Cn—1)Lnn—2(¥) — 
Ln = (¥ — Cn—2)Lnn—3(¥) — 


= Ln n—1(Y) On—1(%) + Ln 


We may thus exhibit L,,,-::(y) as the denominators of the ith convergent 
of the continued fraction 

1 | Ae | 


ly |y— ly — Cn—2 


(51a) 


whose relation to K(y) is obvious. These denominators may also be expressed 
as determinants [2]. We have 


y— Cn An 0 


and L, »-i(y)is formed from L,,,o(y) by dropping the n—i last rows and columns. 
Our expansion is then 


2,,(x) 


b b 


Another method for obtaining the expansion under consideration is furnished 
by the differential equation (38). This method, first applied by Christoffel 
[8] to the Legendre case, may be epitomized as follows. Substitute in (38) 


n 


(52) on (x) 


i=1 
(53) 2,(x) = (gn,n—i) = const.), 
i=1 


and equate the coefficients of ¢,_;(x) (for an expansion of a polynomial in 
terms of ¢;(x) is necessarily unique). This leads to linear relations involv- 
ing the gn,n,-; and the A,,,-;. These become especially simple in the cases 
which follow. 

(i) Hermite case: $p(x) = e^{-x^2}$. This is the simplest case, since $\phi_n'(x) = n\phi_{n-1}(x)$, $\cdots$. Carrying out the indicated computation, we obtain


= (n = 2,3,---), 


2, (x) DA n 
t=0 
(54) An n—come1) = [(— 1)"/2™](n — m — 1)(n — m — 2) 


—---(n — 2m) 


(ii) Laguerre case: $p(x) = e^{-x}$. If we write out the explicit expression of
Laguerre polynomials


we readily obtain the coefficients in the expansion (52) by direct computation: 
=-(— 1)**'2!/(n — i)! 
Using this result we proceed as in the previous case: 
2, (x) n n—iPn—i(X) 


2) Ana], 


2(n — n! 


2n —2 L(n—2)! 


n! 


(n= 


8. Relation between $\psi(x)$ and $\psi_1(x)$. We assume that $K(z)$, hence $K'(z)$,
are related to determined moments problems whose solutions are $\psi(x)$ and
$\psi_1(x)$ respectively. In order to establish a relation between these two functions we make use of the following formula (Perron [2], p. 372):


v(x — 0) +¥(x +0) ¥(%o — 0) + + 0) 
2 2 y=+0 Tt Zytiy 
= x+ iy,a < x, x < Dd), 


where the path of integration is a straight line parallel to the x-axis. If we
apply the same formula to $\psi_1(x)$ and express the result in terms of $\psi(x)$,
using (10), we obtain, since


n—1 
(m =1,2,---, 
2 
n?(n — 2)? 
Ann—-2 = — 2(n — 1), 
55 
2(n — 3 | 
2n — 3 L(n— 3) 
J 
4 




1 
lim r\— f = 0, 
y=+0 zotiy 
the fundamental relation 
¥i(x — 0) + ¥i(x + 0) a ¥i(x%o — 0) + ¥i(xo0 + 0) 
2 2 
i ztiy 1 
= lim d < xo, x <b). 


We may change, if necessary, the values of $\psi_1(x)$ at its points of discontinuity,
so as to have on $(a, b)$ $\psi_1(x) = \tfrac{1}{2}(\psi_1(x - 0) + \psi_1(x + 0))$. Then formula (56)
becomes


(56) 


(57) ¥i(x) — = lim (a < xo, x <b). 


If the explicit expression of the function 


G(s) = ay(u) 


which, for $z$ not in $(a, b)$, coincides with $F(z)$, is known and is such that $1/G(z)$
is regular analytic inside $(a, b)$, with perhaps a finite number of singularities
therein, we can obtain the explicit expression for $\psi_1(x)$, using (57), as follows:
we choose $x$ and $x_0$ sufficiently close to $x$ so that $1/G(x)$ has no singularities
on the whole segment $(x_0, x)$. We have then, by Cauchy's Theorem,


i dx 4 dy + é ds dy 
G(x) Jo G(xo + iy) G(z) Jy G(x + iv) 
Hence 
i pttiv dz dx 
G(z) Jz, G(x) 


Now let y>+0. The last two integrals will approach zero, for 


(58) 


aah 


i” dy 1 


in the rectangle under consideration. Thus, combining (57) and (58): 




(59) — = f 


It follows that, if $1/G(x)$ is regular analytic over the whole interval $(a, b)$,
with the possible exception of the end points, (59) holds for any interval
$(x_0, x) \subset (a, b)$. Then


1 
(60) ore xt (a<x <b) (dls) = 


We shall illustrate these considerations with the following examples: 
(i) $(a, b) = (-1, 1)$; $p(x) = (1 - x^2)^{-1/2}$. Here


1 dy 
-1 (1 — — y) 
The above considerations are applicable and give 


(x? — 1)1/2) 


= x(x? — 1)-1, 


G(x) = 
= = (1 x2) 


which agrees with the results obtained above (page 81). 
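Example (i) can be reproduced numerically. The sketch below rests on two assumptions that are not quoted from the text: the closed form $G(z) = \pi/(z^2 - 1)^{1/2}$ for the integral, with the branch that behaves like $\pi/z$ at infinity, and the reading of the inversion as $p_1(x) = (1/\pi)\,\mathrm{Im}\,[1/G(x + i0)]$, which is what the usual Stieltjes inversion gives when $F(x) = (x - c_1 - F_1(x))^{-1}$. The constant factor obtained this way depends on how $p$ is normalized and may differ from the one in the text.

```python
import cmath
import math

def G(z):
    """Integral of dy / ((1 - y^2)^(1/2) (z - y)) over (-1, 1), z off the cut.

    Written as pi / (z * sqrt(1 - 1/z^2)) so that the principal square root
    selects the branch with G(z) ~ pi/z for large |z|.
    """
    return math.pi / (z * cmath.sqrt(1 - 1 / (z * z)))

def p1(x, eps=1e-9):
    """Approximate (1/pi) * Im[1/G(x + i*eps)] for -1 < x < 1, x != 0."""
    z = complex(x, eps)
    return (1 / G(z)).imag / math.pi

def expected(x):
    # (1 - x^2)^(1/2) up to the constant 1/pi^2 produced by this normalization.
    return math.sqrt(1 - x * x) / math.pi ** 2
```

The point $x = 0$ is excluded only because the chosen way of writing the branch divides by $z$; the limit itself is regular there.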
(ii) $(a, b) = (-1, 1)$; $p(x) = (1 + x)^{\alpha-1}(1 - x)^{\beta-1}$, $\alpha$, $\beta$ positive integers. Here


7“? 


1 1 1 —_— 
aw = f G+ — 9) 

$G(x) = Q(x) + (1 + x)^{\alpha-1}(1 - x)^{\beta-1}\log[(x + 1)/(x - 1)]$,
where $Q(x)$ is a polynomial with real coefficients, of degree $\le \alpha + \beta - 3$, or a
constant if $\alpha = \beta = 1$, which can be written at once by applying the binomial
theorem to the integrand in (61). Hence, we can again apply (60), with the
result


(1+x)(1 — 


(62) pi(x)= 


—x 
Examples: 
1+ x\? 
pi(x) = (1 + 4—2e+ (1+ + (1 + x)4x? 
(a 3, 8 1, (a, b) (- 1, 1)); 


1 3 
(63) p(x) = | (log | ®. 




An expression similar to (63) was obtained, though in a different manner and 
for another purpose, by T. Carleman [11]. 

(iii) $(a, b) = (-1, 1)$, $p(x) = (1 + x)^{\alpha-1}(1 - x)^{-\alpha}$ $(1 > \alpha > 0)$. From (36) and
(30) we see that in this case $k_1 = 0$, so that (42), (60) yield here

(64) $F(x) = cp(x)$ ($c$ = const.),
(65) $p_1(x) = 1/p(x)$ (within a constant factor).

This agrees with the result on page 82, obtained there in an entirely different
manner. If, for example, we take $\alpha = \beta = 1/2$, [10], then

$F(x) = \int_{-1}^{1} \frac{dy}{(1 - y^2)^{1/2}(x - y)} = \pi(x^2 - 1)^{-1/2}$,

and consequently by (65) $p_1(x) = (1 - x^2)^{1/2}$ (see page 81).


BIBLIOGRAPHY 


1 Stieltjes, Recherches sur les fractions continues, Oeuvres, vol. 2, pp. 398-566.

2 O. Perron, Die Lehre von den Kettenbrüchen, 2d edition, 1929, Chapters VII, VIII, IX.

3 H. Hamburger, Ueber eine Erweiterung des Stieltjesschen Momentenproblems, Mathematische
Annalen, vol. 81 (1920), pp. 234-319, vol. 82 (1921), pp. 120-164, 168-187.

4 J. Shohat and J. Sherman, On the numerators ..., Proceedings of the National Academy of
Sciences, vol. 18 (1932), pp. 283-287.

5 Jacques Chokhate (J. Shohat), Sur le développement de l'intégrale $\int_a^b p(y)\,dy/(x - y)$,
Rendiconti di Palermo, vol. 47 (1923), pp. 25-46.

6 J. Shohat, On the Stieltjes continued fraction, American Journal of Mathematics, vol. 54 (1932),
pp. 79-84.

7 Jacques Chokhate (J. Shohat), Sur une classe étendue de fractions continues ..., Comptes
Rendus, vol. 191 (1930), p. 988.

8 Christoffel, Über die Gaussische Quadratur ..., Crelle, vol. 55 (1858), pp. 61-83, where the
differential equation for the Legendre case $\alpha = \beta = 1$ is obtained in an entirely different manner.

9 P. Appell et J. Kampé de Fériet, Fonctions Hypergéométriques et Hypersphériques. Polynomes
d'Hermite, Paris, 1926, p. 338.

10 C. Possé, Sur quelques Applications des Fractions Continues Algébriques, St. Petersbourg, 1886.

11 T. Carleman, Sur la résolution de certaines équations intégrales, Arkiv för Matematik, Astronomi och Fysik, vol. 16 (1922), No. 26.


UNIVERSITY OF PENNSYLVANIA, 
PHILADELPHIA, Pa. 


THREE-DIMENSIONAL MANIFOLDS AND THEIR 
HEEGAARD DIAGRAMS* 


BY 
JAMES SINGER 


INTRODUCTION 


One of the outstanding problems in topology today is the classification
of n-dimensional manifolds, n ≥ 3. Poincaré,† the founder of modern analysis
situs, devoted several papers to it and allied problems. Heegaard,‡ in a paper
concerned primarily with another aspect of the subject, found it convenient
to construct a pseudo-normal form for a 3-dimensional manifold, a form
which we now call the Heegaard diagram. Dehn§ and Veblen|| gave modifications of his construction.

The Heegaard diagram of a 3-dimensional manifold consists of a closed 
2-dimensional manifold upon which are drawn a certain number of non- 
intersecting simple closed curves. Any diagram is an adequate representation 
of a 3-dimensional manifold in the sense that it completely determines such 
a manifold, but, unfortunately, a 3-dimensional manifold gives rise to an 
infinity of diagrams. The problem of classifying manifolds is thus transferred 
to the problem of classifying diagrams. 

Heegaard, in the paper cited above, studied (although not completely) 
the modifications that can be made on the curves and surface of a diagram 
which transformed it into another diagram but yet did not change the mani- 
fold which it represented. In this paper we extend Heegaard’s results and 
study more completely the relationships between manifolds and their dia- 
grams. 

We begin then (Part I) by introducing the notions of a canonical region, 
canonical surface and canonical curve of a manifold. The Heegaard diagram is 
then constructed from a canonical surface and curves. Then, before proceed- 
ing to a discussion of manifolds and their diagrams, we show (Part II) how 
to read off the usual invariants of a manifold from any one of its representa- 
tive diagrams. 

* Presented to the Society, October 29, 1932; received by the editors June 10, 1932. 

† Poincaré, H. For references see his paper in the Rendiconti del Circolo Matematico di Palermo,
vol. 18 (1904).

‡ Heegaard, P. A translation into French is given in the Bulletin de la Société Mathématique
de France, vol. 44 (1916).

§ Dehn, M., Mathematische Annalen, vol. 69 (1910). 

|| Veblen, O., The Cambridge Colloquium, Analysis Situs, 2d edition, p. 155. 




We then define (Part III) a set of moves which operate on the curves and 
surface of a diagram and transform it into another. Two diagrams are called 
equivalent if one can be obtained from the other by a finite number of these 
moves. We then prove by a sequence of theorems (Part IV) in which we make 
use of a specially constructed canonical surface and diagram that any two 
representative diagrams of a manifold are equivalent. Several other theorems 
(Part V) lead up to the general theorem of equivalence to the effect that 
equivalent manifolds (equivalent in the sense of semi-linear analysis situs) 
arise from and give rise to equivalent Heegaard diagrams (equivalent in our 
sense) and vice versa. 

I wish to thank Professor J. W. Alexander for his many suggestions in the 
preparation of this paper. 


I. PRELIMINARY DEFINITIONS 


1. We will assume that the reader is familiar with the simplex, complex, 
manifold, incidence matrix, etc., as used in combinatorial analysis situs as, 
for example, in the book by Veblen, loc. cit. Throughout this paper, simplexes, 
cells, etc., will be at most 3-dimensional. 

2. We derive here several elementary properties of the simplex which we
will need later. Let $\sigma_3 = A^0A^1A^2A^3$ be a closed 3-simplex, $\sigma_k$, $\sigma_{2-k}$ ($k = 0$ or $1$)
two opposite faces of $\sigma_3$ (e.g., $\sigma_0 = A^0$, $\sigma_2 = A^1A^2A^3$); let $\sigma_3'$ be the derived
complex (i.e., a regular subdivision) of $\sigma_3$. Every 3-simplex of $\sigma_3'$ contains a
vertex of $\sigma_k$ or of $\sigma_{2-k}$, but not of both. It follows at once that all the 3-simplexes of $\sigma_3'$ fall into two groups, $R_1$ and $R_2$, where $R_1$ contains all the simplexes incident with $\sigma_k$ and $R_2$ contains all those incident with $\sigma_{2-k}$. Moreover,
$R_1$ and $R_2$ have as boundaries $B_1 + B$ and $B_2 + B$, respectively, where $B_1$
and $B_2$ consist of 2-simplexes of $\sigma_3'$ on the boundary of $\sigma_3$, and the common
portion $B$ is a 2-cell whose boundary is a circuit on the boundary of $\sigma_3$ which
“separates” $\sigma_k$ and $\sigma_{2-k}$.
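The splitting of the derived complex into two groups can be enumerated directly. The sketch below assumes that the derived complex is the first barycentric subdivision, so that each of its 3-simplexes corresponds to a flag vertex < edge < triangle < tetrahedron, that is, to an ordering of the four vertices; the only vertex of the original simplex in such a 3-simplex is the first member of the flag. The function names are illustrative.

```python
from itertools import permutations

def derived_3_simplexes(vertices):
    """All flags v0 < v0v1 < v0v1v2 < v0v1v2v3, one per ordering of the vertices."""
    return [tuple(frozenset(order[:k + 1]) for k in range(4))
            for order in permutations(vertices)]

def split_by_opposite_faces(vertices, face):
    """Group the 3-simplexes by which of two opposite faces they touch.

    `face` is one face (a set of vertices); the opposite face consists of
    the remaining vertices.
    """
    face = set(face)
    r1, r2 = [], []
    for flag in derived_3_simplexes(vertices):
        (v,) = flag[0]          # the single original vertex in this 3-simplex
        (r1 if v in face else r2).append(flag)
    return r1, r2

vertices = ("A0", "A1", "A2", "A3")
r1, r2 = split_by_opposite_faces(vertices, {"A0"})            # vertex vs. triangle
e1, e2 = split_by_opposite_faces(vertices, {"A0", "A1"})      # edge vs. edge
```

For a vertex and its opposite triangle the 24 simplexes divide 6 to 18; for two opposite edges they divide 12 to 12.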

3. The canonical region, cell and curve. Given a 3-cell with its boundary 
sphere S2, we may decompose S; into a sum E; +£;’+B where Ej and Ey’ 
are two cells such that Ey and Ey’ do not meet and B is a spherical band.* 
We can think of this system as a handle with the £’s as its bases. 

We will now take a euclidean 3-sphere and attach p handles to it thus
obtaining a 3-dimensional region R which we call canonical. Explicitly, we 
have R=E? +)>-?_, (Ei + Es ‘+i’ *) where E? is the spherical region and the 
elements in the sum represent the handles and their bases. The (point set)- 
boundary of R will be designated by L.† Incidentally, L is not to be construed


* The bar over a symbol for a set denotes the closure of the set. 
† We shall hereafter omit the words point set, understanding that whenever we speak of the
boundary of a canonical region, we mean the (point set)-boundary. 




to lie in euclidean 3-space, for it may very well happen that L is non-orientable.

We recognize in R the following property: if we sever each handle by a
cross cut in the form of a 2-cell $E_2^i$, what is left is a 3-cell. In other words we
can think of a canonical region as a region R within which there are p 2-cells
$E_2^i$, with boundaries $e^i$ on the boundary L of R (no two E's intersecting)
such that $R - \sum \bar{E}_2^i$ is a 3-cell. The cells $E_2^i$ will be called canonical cells and
their boundaries canonical curves.

4. The canonical surface and the Heegaard diagram. A surface L is said 
to be a canonical surface of a manifold M if it satisfies these conditions: 

(a) L is a subcomplex of M and is a closed, connected 2-dimensional manifold;

(b) M can be decomposed into $R_1 + L + R_2$, where $R_1$ and $R_2$ are canonical
regions with the common boundary L.

We note four properties of the canonical surface and regions which follow 
directly from their definitions: 

A. If M' is a subdivision of M, and L', $R_1'$ and $R_2'$ the induced subdivisions of L, $R_1$ and $R_2$, then L' is a canonical surface of M' dividing it into the
canonical regions $R_1'$ and $R_2'$. We will also call L' a canonical surface of M.

B. The number of 2-cells that must be removed from $R_1$ to reduce it to
a 3-cell is the same as the number that must be removed from $R_2$, and each
is precisely the maximum number of non-intersecting circuits that can be
drawn on L without disconnecting it (Heegaard).

C. $R_1$ and $R_2$ are homeomorphic, since they have the same boundary.

D. L and M are both orientable or both non-orientable (Heegaard).

5. A problem as yet unsolved is the determination of the minimum genus 
(or connectivity) of all the canonical surfaces of a given manifold M. This 
number is clearly a topological invariant. A simpler question is to ask: under 
what conditions can the genus of a canonical surface of a manifold be lowered 
or raised? Lemmas 1 and 2 below do not state the most general circumstances 
but they are sufficient for our needs. 

6. In this paragraph we shall use the following notation: Let R be a 
canonical region, L its boundary; E;, Ej and €; 3-cells with boundary spheres 
So, Sf and Ss. Let S2 be separated by a circuit into the two 2-cells E, and 
€;, and let E;[Ej | be separated by two non-intersecting circuits into two 
2-cells EZ}, E? [Es1, Ef? | and a band B [B’]. Finally, let B [B’] be separated 
into two 2-cells [EJ™, ] by two arcs, each running from one of 
its bounding circuits to the other. 


Lemma 1a. If €3 and R+L have in common only E: on their boundaries,
then R+ €;+ €, is a canonical region whose boundary is L+ Ef — Ex. 




Lemma 2a. If €; is a subcomplex of R and if S2and L have € in common, 
then R— €;— €, is a canonical region whose boundary is L+ €,— Ez. 


These lemmas need no proofs. Obviously, the genus of L is neither raised 
nor lowered by the operations of the lemmas. 

We can, however, change the genus of L by removing or adding a handle, 
E;. 

Lemma 1b. If E; is a subcomplex of R and if S, and L have in common B, 
then R—(E;+E} +E?) is a canonical region whose boundary is L+(E} +E? 
—B). 

Lemma 2b. If E; and R+L have in common only E} and E? on their
boundaries, then R+(E;+E? +E?) is a canonical region whose boundary is 
L+(B-—E} —E?). 

These lemmas, too, need no proof. In the first case, the genus of L is 
lowered by the removal of a handle, in the second, raised by the addition of a 
handle. 

We can lower the genus of L by attaching to it a 3-cell Ej along a band 
and we can raise the genus by removing such a 3-cell, i.e. by “boring” a hole 
through the region. However, we can add or remove a 3-cell only under cer- 
tain conditions which are stated in the lemmas below. 

Lemma 1c. Let Ej and R+L have no points in common and Si and L have
B’ in common, and let E; be as in Lemma 1b; then, if S2 and Si have only 
Ex! =E{*! in common, R+Ej +B’ is a canonical region whose boundary is 
L+ (Ez) + Ez? —B’). 

For since, by Lemma 1b, R—(£;+ £7 +?) is a canonical region whose 
boundary has the closure of the 2-cell Z| +E? + £;** in common with the 
boundary of the 3-cell E;+ EZ; +£,*', it follows from Lemma 1a that 


[R — (Es + Ed + Ef) | + [Es + Ei + 
+ [Ey + E? + Ex*] = R+ Ej + B 


is a canonical region. It is clear that the boundary of R+ Ej +B’ is L+E;! 
—B’. 

Lemma 2c. If $E_3'$ is a subcomplex of $R$ such that $S_2'$ and $L$ have $E_2^{\prime 1}$ and
$E_2^{\prime 2}$ in common, and if there exists a 3-cell $E_3$, subcomplex of $R$, such that
$E_2^{\prime *1}=E_2^{*1}$ and $E_2^{*2}$ is a 2-cell on $L$, then $R-E_3'-B'$ is a canonical region
whose boundary is $L-E_2^{\prime 1}-E_2^{\prime 2}+B'$.

For, since E;+ Ej +£Ez*' is a 3-cell, it follows from Lemma 2a that 
R-(E;+ Ej + Ei*') —(£ +E? +E/**) is a canonical region. Hence, by 


92 JAMES SINGER [January 


Lemma 2b, R—(E;+ Ej + — (Ei +E? + E:**)+E;+E} +E? =R-—E; 
— B’ is a canonical region, whose boundary, as can be readily seen, is L — Ey! 
+ B’. 

7. Let the canonical surface L of a manifold M divide it into the two
regions $R_1$ and $R_2$; let E and F be canonical 2-cells of $R_1$ and $R_2$, respectively,
e and f, their boundaries. Let $E_3$ be a 3-cell as in Lemma 1b, where we put
$E=E_2^1$ and $R=R_1$. Let A be a 1-cell interior to $R_1$ with end points on L.
Let $E_3'$ be a 3-cell as in Lemma 2c which is a neighborhood of A in $R_1$.
The 2-cells $E_2^{\prime 1}$ and $E_2^{\prime 2}$ will be neighborhoods on L of the end points of A.
We now have


Lemma 1. If the canonical curves e and f meet once and only once, then
$L+(E_2^1+E_2^2-B)$ is a canonical surface dividing M into the two canonical
regions $R_1-(E_3+E_2^1+E_2^2)$ and $R_2+(E_3+B)$.

Lemma 2. If there exists a 1-cell A’ on L such that A+A’ bounds a 2-cell
of $R_1$, then $L-(E_2^{\prime 1}+E_2^{\prime 2}-B')$ is a canonical surface dividing M into the
canonical regions $R_1-(E_3'+B')$ and $R_2+(E_3'+E_2^{\prime 1}+E_2^{\prime 2})$.


The proofs of the two lemmas follow at once from Lemmas 1 abc, 2 abc. 
We note that the effect of the first lemma is to remove a handle from $R_1$ (as
in Lemma 1b) and to add a 3-cell to $R_2$ (as in Lemma 1c) and the effect of the
second lemma is to remove a 3-cell from $R_1$ (as in Lemma 2c) and to add a
handle to $R_2$ (as in Lemma 2b). The genus of the canonical surface is lowered
in the first lemma, raised in the second.

8. Construction of the Heegaard diagram. The utility of the canonical sur- 
face and curves lies in the fact that they give us an adequate representation 
of the manifold. Indeed, let L be a canonical surface of a manifold M, and let
$e^1, \dots, e^p$ and $f^1, \dots, f^p$ be two sets of canonical curves, boundaries of
canonical sets of 2-cells of $R_1$ and $R_2$, respectively. Then, having L, $e^1, \dots,
e^p$, $f^1, \dots, f^p$, we can dispense with the rest of M entirely, for any information
that can be derived from M can be derived from them. To reconstruct a
3-dimensional manifold, we attach 2-cells to each of the canonical curves, and
then add two 3-cells in the obvious way. We thus obtain a manifold N which 
in general will not be identical to M, but which must always be homeomor- 
phic to it because of the construction. 

Of great use in the study of manifolds is the fact that a model of the 
canonical surface and curves of any manifold can be constructed in ordinary 
spherical 3-space, $S_3$. If the canonical surface were orientable, we could
immerse it in $S_3$ immediately; this is impossible if it is non-orientable. To treat
both cases at the same time, we adopt one of the normal forms for a
2-dimensional manifold which can be immersed in $S_3$, i.e., the plane (plus a




point at infinity) from which the interiors of $2p$ circles have been removed.*
We call the surface A, its $2p$ bounding circles, $\varepsilon_1^i$, $\varepsilon_2^i$, $i=1, \dots, p$. For the
sake of definiteness later on, let us assume that the circles are of equal radii
with centers equally spaced along a straight line.

We now establish a continuous correspondence between the points of L
and A which is (1, 1) everywhere except that a point on $e^i$ has for image a
point on $\varepsilon_1^i$ and a point on $\varepsilon_2^i$. Thus, the image of an e curve will be a pair of
the $\varepsilon$ circles on A. If a particular $f^i$ does not meet any of the e curves, then its
image on A will be a circuit; if it does meet the e’s, then its image on A will
consist of a set of arcs, each joining two points on the circles. We call the
circuit or aggregate of arcs corresponding to $f^i$, $\varphi^i$.

From what has been said above, it is clear that A, $\varepsilon_1^i$, $\varepsilon_2^i$, and $\varphi^i$
($i=1, \dots, p$) also serve as an adequate representation of the manifold M. We
call this representation a Heegaard diagram of M.

For some purposes it is convenient to have a Heegaard diagram in which 
the f’s are mapped on the pairs of circles, and the e’s become aggregates of 
arcs. In such cases, we shall use another plane A’, and introduce notation as 
needed. 

To reconstruct a manifold N from a Heegaard diagram A, we first sub- 
divide A, if necessary, and then construct a closed 2-dimensional manifold L 
equivalent to A where, however, a single circuit $e^i$ corresponds to the pair of
circles $\varepsilon_1^i$ and $\varepsilon_2^i$. We must take care that $\varepsilon_1^i$ and $\varepsilon_2^i$ are matched upon $e^i$ with
the proper orientations. We then proceed as before, successively adding the 
2-cells and finally the two 3-cells. 

If a Heegaard diagram is constructed from a manifold, we shall say that 
the manifold gives rise to the diagram and that the diagram arises from the 
manifold; similarly we shall say that a diagram gives rise to a manifold and 
that the manifold arises from the diagram when the manifold is constructed 
from the diagram. 

9. Since a Heegaard diagram is an adequate representation of its mani- 
fold, all invariants of the latter should be obtainable from the former. We 
show how to get the Poincaré group, homology characters, and some inter- 
section invariants in the next section. 

Since any one manifold can give rise to a great variety of diagrams, 
we do not seem to be any nearer the solution of the problem of the classifica- 
tion of manifolds. We reserve all such questions for Parts III, IV, and V; at 
the present we merely note that the form of a Heegaard diagram arising from 
a manifold is by no means unique. 


* See Alexander, J. W., Normal forms for one- and two-sided surfaces, Annals of Mathematics, 
(2), vol. 16, No. 4, June, 1915. 




II. THE KNOWN INVARIANTS 


1. The Poincaré group. Let A be a Heegaard diagram of a manifold M,
where the diagram consists of the plane A, the circles $\varepsilon_1^i$, $\varepsilon_2^i$ and the curves
$\varphi^i$, $i=1, 2, \dots, p$, all defined as in Part I.

It is well known that there exist, on A, $2p$ circuits, $a_i$ and $b_i$, $i=1, \dots, p$,
all passing through a fixed point O but having no other points in common and
such that every other closed curve of A is deformable into a sum of the a’s
and b’s. The a’s and b’s can then be taken as the generators of the Poincaré
group of A. Moreover, we can always choose the curves in such a manner
that $a_i$ is isotopic to $\varepsilon_1^i$ on A and $b_i$ consists of two arcs joining O to congruent
points of $\varepsilon_1^i$ and $\varepsilon_2^i$. We can choose a similar base on A’ (another
representation of A, in which a pair of circles $\varphi_1^i$ and $\varphi_2^i$ represents a canonical curve $f^i$),
namely $c_i$ and $d_i$, $i=1, \dots, p$, where $c_i$ is isotopic to $\varphi_1^i$ on A’ and $d_i$
consists of two arcs joining O’ (image of O) to congruent points of $\varphi_1^i$ and $\varphi_2^i$.

Since the surfaces A and A’ are representations of the same surface L of M,
we can express every curve of one base as a product of the generators of the
other base, i.e., $c_i = a_{i_1}^{\epsilon_1} b_{i_2}^{\epsilon_2} a_{i_3}^{\epsilon_3} b_{i_4}^{\epsilon_4} \cdots$, where $i_j$ is one of the integers $1, 2, \dots,
p$ and $\epsilon_j$ is 0, 1 or $-1$. Symbolically we can write

(1) c;=fab, d; = 

(1’) II Fed, b; II ‘cd. 


The only identity relation among the generators is 

in case A is orientable, or 


in case A is non-orientable.* There are, of course, similar expressions in terms 
of the c’s and d’s. 

Let us now adjoin to A the pairs of congruent 2-cells $E_1^i$ and $E_2^i$, interiors
of the circles $\varepsilon_1^i$ and $\varepsilon_2^i$, where $E_1^i$ and $E_2^i$ correspond to the same canonical
2-cell $E^i$ of $R_1$. To note what modification the group of A undergoes, we shall
make use of a theorem concerning the Poincaré group of an arbitrary
2-dimensional complex $K_2$: If the Poincaré group of $K_2$ is generated by the
elements

    $g_1, g_2, \dots, g_p$

with the identity relations

    $r_i = g_{i1}^{\epsilon_1} g_{i2}^{\epsilon_2} \cdots = 1 \qquad (i = 1, 2, \dots, q)$


* The identity relation in the non-orientable case is not one of the usual forms; it is, however, 
equivalent to them in the sense that we can so choose the generators of the Poincaré group that (2’) 
holds. 




where $g_{ij}$ is either a generating element or its inverse and we adjoin to $K_2$ a
2-cell $E$ having no points in common with $K_2$ but whose boundary is $h$, where
$h$ is a circuit of $K_2$, then the group of $K_2+E$ is the same as the group of $K_2$
plus the additional relation

    $r = 1$,

where $r$ is a product of the $g$’s representing a curve isotopic to $h$ on $K_2$.* In
particular, if $h$ is isotopic to a generator, say $g_1$, of the group of $K_2$, the group
of $K_2+E$ can be obtained by replacing $g_1$ by 1 wherever it occurs in the
identity relations of the group of $K_2$.
It follows at once that the group of $A+\sum_{i=1}^{p}(E_1^i+E_2^i)$ is generated by
the $p$ elements

(3)    $b_1, b_2, \dots, b_p$

with no generating relations, since (2) or (2’) reduces to the unit element
identically. The products (1) and (1’) take on new forms which we write
symbolically as


(4) = 11/18, d; = 
(4’) 0b; = 


We now add the 2-cells $F^i$ to $A+\sum_{i=1}^{p}(E_1^i+E_2^i)$, obtaining the
2-dimensional complex $L+\sum E^i+\sum F^i$. The products (4) and (4’) take on the
new forms


(5) 1=1/"b, 
(5’) 1=1/'%d, 6; = 


From what has just been said, it is clear that the group of $A+\sum E^i+\sum F^i$ is
generated by the $p$ elements (3) connected by the $p$ relations $\prod b = 1$ of
(5). We obtain an equivalent Poincaré group if we take for basis the $p$
elements


(3’) C1, C2,°* * Cp 


connected by the $p$ relations $\prod d = 1$ of (5’).
The addition of the remainder of the manifold M (that is, the two 3-cells) 
can have no effect on the Poincaré group since a curve homotopic to a point 


* If r’ is another product of the g’s also representing a curve isotopic to k on Ke, then the group 
obtained by adding r’ is equivalent to the group obtained by adding r, since 1=r’=srs~!, where 
s is a product of g’s. 




over M is first deformable over M on to $L+\sum E^i+\sum F^i$ and is then homotopic
to a point over $L+\sum E^i+\sum F^i$.
Hence the Poincaré group G of M is given by


generators 


identity relations 


1 


where the identity relations are the products II/’! of (5) written out in full. 
There is an equivalent group in terms of the d’s, but we shall not write it out. 

2. The homology characters. Associated with $G$ is the group $G_c$ which
is obtained by making all the elements commutative. $G_c$ is given by

    $b_1, b_2, \dots, b_p$    (generators)

    $b_1^{\alpha_1^i} b_2^{\alpha_2^i} \cdots b_p^{\alpha_p^i} = 1 \quad (i = 1, 2, \dots, p)$    (identity relations)

where $\alpha_j^i$ is the sum of the exponents of $b_j$ in the $i$th identity relation of $G$.

Every invariant of $G_c$ is an invariant of M, the manifold giving rise to
the Heegaard diagram A. In particular, $G_c$ yields the 1-dimensional Betti
and torsion numbers.
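
Passing from the identity relations to the matrix of exponents can be sketched in a few lines. The code is a minimal illustration, not the author's procedure: it assumes each relation is stored as a list of (generator, exponent) letters, the function name `exponent_matrix` is hypothetical, and the two relators are invented for the example rather than taken from any particular manifold.

```python
def exponent_matrix(relators, generators):
    """Row i holds, for each generator, the sum of its exponents in relator i."""
    index = {g: k for k, g in enumerate(generators)}
    matrix = []
    for word in relators:
        row = [0] * len(generators)
        for gen, exp in word:
            row[index[gen]] += exp   # commutativity: only exponent sums matter
        matrix.append(row)
    return matrix


# Hypothetical relators on two generators (illustration only).
generators = ["b1", "b2"]
relators = [
    [("b1", 1), ("b2", 1), ("b1", 1), ("b2", -1)],   # b1 b2 b1 b2^-1
    [("b2", 1), ("b2", 1), ("b2", 1)],               # b2^3
]
alpha = exponent_matrix(relators, generators)
print(alpha)  # [[2, 0], [0, 3]]
```

Only the exponent sums survive, which is why the order of the letters in each relation plays no part in the homology characters.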

If M is orientable, its Betti numbers are 1, $p-\rho$, $p-\rho$, 1; if M is
non-orientable, its Betti numbers are 1, $p-\rho$, $p-\rho-1$, 0; where $\rho$ is the rank of
the matrix $\|\alpha_j^i\|$. If M is orientable, it has a set of 1-dimensional coefficients
of torsion given by the invariant factors of $\|\alpha_j^i\|$; if M is non-orientable, it
has in addition to these a 2-dimensional coefficient of torsion equal to two.
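
These formulas can be checked on small matrices. The sketch below is an illustration under stated assumptions: it computes the rank and the invariant factors from the greatest common divisors of the k-rowed minors (a standard characterization of the invariant factors, chosen here for brevity and practical only for small matrices), it reports only the invariant factors greater than 1 as coefficients of torsion, and the function names and sample matrices are hypothetical.

```python
from itertools import combinations
from math import gcd


def det(m):
    """Integer determinant by expansion along the first row (small matrices)."""
    if len(m) == 1:
        return m[0][0]
    total = 0
    for j in range(len(m)):
        minor = [row[:j] + row[j + 1:] for row in m[1:]]
        total += (-1) ** j * m[0][j] * det(minor)
    return total


def invariant_factors(a):
    """Return (rank, factors); d_k = gcd of k-rowed minors, factor_k = d_k/d_(k-1)."""
    rows, cols = len(a), len(a[0])
    factors, prev = [], 1
    for k in range(1, min(rows, cols) + 1):
        d_k = 0
        for r in combinations(range(rows), k):
            for c in combinations(range(cols), k):
                d_k = gcd(d_k, det([[a[i][j] for j in c] for i in r]))
        if d_k == 0:            # all k-rowed minors vanish: the rank is k - 1
            break
        factors.append(d_k // prev)
        prev = d_k
    return len(factors), factors


def homology_characters(a, orientable=True):
    """Betti numbers and torsion coefficients from the p-by-p exponent matrix."""
    p = len(a)
    rho, factors = invariant_factors(a)
    torsion_1 = [e for e in factors if e > 1]
    if orientable:
        return [1, p - rho, p - rho, 1], torsion_1, []
    return [1, p - rho, p - rho - 1, 0], torsion_1, [2]


print(homology_characters([[5]]))             # ([1, 0, 0, 1], [5], [])
print(homology_characters([[2, 0], [0, 3]]))  # ([1, 0, 0, 1], [6], [])
print(homology_characters([[0]]))             # ([1, 1, 1, 1], [], [])
```

The third value returned is the list of 2-dimensional coefficients of torsion, which is empty in the orientable case.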

3. The intersection numbers. We now pass to the intersection invariants 
of the manifold M. There are two types of such invariants, the Kronecker 
indices of the 1- and 2-cycles and the intersection 1-cycles of 2-cycles with one 
another. To find these invariants we need to construct a homology base for 
the 1- and 2-cycles. 

Let us consider the boundary relations 


(6)    $F^i \to \varphi^i$    (on M)




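and the homologies

(7)    $\varphi^i \sim \sum_j \alpha_j^i b_j$    (on L),

where the matrix $\|\alpha_j^i\|$ is the same as the matrix of exponents in the identity
relations of $G_c$. It is well known that there exist unimodular transformations
of the $b$’s into new curves $b^{\prime i}$ and of the $F$’s and $\varphi$’s into $F^{\prime i}$ and $\varphi^{\prime i}$
such that the bounding relations (6) and the homologies (7) become

(8)    $F^{\prime i} \to \varphi^{\prime i}$    (on M)

and

(9)    $\varphi^{\prime i} \sim \sum_j \alpha_j^{\prime i} b^{\prime j}$    (on L)

respectively, where $\|\alpha_j^{\prime i}\|$ is a diagonal matrix. The first $\rho-\tau$ terms of the
main diagonal of $\|\alpha_j^{\prime i}\|$ are equal to 1, the next $\tau$ terms are the invariant factors
(1-dimensional coefficients of torsion) and all the other terms are zero.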

Since every 1-cycle of M is homologous to a linear combination of the $b$’s,
and hence to the $b'$’s, and since the 1-dimensional Betti number of M is
$p-\rho$, it follows that the curves

(10)    $b^{\prime i}$    ($i=\rho+1, \rho+2, \dots, p$)

form a basis for the non-bounding 1-cycles of M in this sense: if $b$ is any
1-cycle of M, then

    $b \sim \sum_{i=\rho+1}^{p} a_i b^{\prime i} + \sum_{i=\rho-\tau+1}^{\rho} a_i b^{\prime i}$.

But since some multiple of $b^{\prime i}$ ($\rho-\tau+1 \le i \le \rho$) bounds, it will meet any
2-cycle of M zero times algebraically. Hence as far as the intersection of $b$ with
a 2-cycle is concerned, only the terms of the first sum in $b$ need to be
considered.
It can be shown that the 2-dimensional complexes

(11)    $F^{\prime i}$    ($i=\rho+1, \rho+2, \dots, p$)


can serve as a base for the non-bounding 2-cycles as far as intersections are 
concerned. To be sure, none of the F’s is a cycle, nor is any combination of 
them a cycle since every sum )-\,F’+ has for boundary >-A.¢’¢, but it can be 
shown that if the complexes Ff, i=p+1,---, p, do form a base for the 
non-bounding 2-cycles of M in the same sense as the 5‘’s just above, then the 
intersection of b‘ and Fi, on M is the same as the intersection of b‘ and the 




boundary of F/ on L. Hence the intersection matrix of the non-bounding 1- 
and 2-cycles of M is given by 


(12)    ($i, j = \rho+1, \dots, p$)


where the element in the $i$th row and $j$th column is the Kronecker index of
$b^{\prime i}$ and $\varphi^{\prime j}$ on A. In case M is orientable, the matrix (12) is unimodular and
can be transformed into a diagonal matrix by further unimodular
transformations of the $b'$’s and $F'$’s.

The 2-complexes (11) also yield the intersections of the non-bounding 
2-cycles among themselves. Indeed, we can compute from them a cubic 
matrix 


(13) (i,j =1,2,---, p), 


where we can show that $\sum_k \sigma_{ijk} b^k$ is a cycle homologous to the intersection
cycle of two non-bounding 2-cycles of M, as defined, for example, by
Lefschetz in his Colloquium Lectures, Chapter IV. The $\sigma$’s are precisely the
same as those defined by Alexander in Proceedings of the National Academy
of Sciences, 1924, p. 99.

They are obtained in this fashion. If $i=j$, then $\sigma_{ijk}=0$ for all $k$’s. If $i \ne j$,
we proceed as follows.

We consider a particular pair of (11), say $F^{\prime 1}$ and $F^{\prime 2}$, and a particular
canonical circle, say $\varepsilon_1^3$. Since $F^{\prime 1}$ can be considered a two-cycle, its boundary
on A, namely $\varphi^{\prime 1}$, meets $\varepsilon_1^3$ zero times algebraically, if we count each
intersection with its proper orientation and multiplicity. Now imagine that the
curve $\varepsilon_1^3$ is shrunk to its center A. The point A is met zero times algebraically
by the segments of $\varphi^{\prime 1}$ abutting on it.

Similarly, the point A will be met zero times, algebraically, by the
segments of $\varphi^{\prime 2}$ incident with it.

If no segment is common to $\varphi^{\prime 1}$ and $\varphi^{\prime 2}$, we define the number $\sigma_{123}$ as the
Kronecker index of $\varphi^{\prime 1}$ and $\varphi^{\prime 2}$ at the point A, where we take into account
the multiplicity of the various branches.

Suppose now that the segment $a$ belongs to $\varphi^{\prime 1}$ with the multiplicity $m_1$
and to $\varphi^{\prime 2}$ with the multiplicity $m_2$. Replace $a$ by two distinct segments $a^1$
and $a^2$ lying very close to it, and let $a^1$ belong to $\varphi^{\prime 1}$ with the multiplicity $m_1$
and $a^2$ to $\varphi^{\prime 2}$ with the multiplicity $m_2$. If we do this for all segments common
to $\varphi^{\prime 1}$ and $\varphi^{\prime 2}$ we are led to the former case and can define $\sigma_{123}$ as there.
However, it is apparent that $\sigma_{123}$ as so defined is not unique, for the Kronecker
index will depend on the method of replacing $a$ by $a^1$ and $a^2$, that is, on
whether we take $a^1$ to the right or left of $a^2$. We therefore define as the $\sigma_{123}$




the number of smallest numerical value so obtained. As a matter of fact, the 
difference between any two values represents a cycle which is homologous to 
zero; we choose the smallest numerical value purely for convenience. 

We can now define the $\sigma$’s for all $i$, $j$, and $k$ in the same way. It is clear that
we need not consider the canonical circles $\varepsilon_2^k$, since the intersections are the
same as on $\varepsilon_1^k$.


III. DEFINITIONS OF THE MOVES AND EQUIVALENT HEEGAARD DIAGRAMS 


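1. Two Heegaard diagrams A and A’ with canonical curves $\varepsilon_1^1, \dots,
\varepsilon_2^p$, $\varphi^1, \dots, \varphi^p$ and $\varepsilon_1^{\prime 1}, \dots, \varepsilon_2^{\prime q}$, $\varphi^{\prime 1}, \dots, \varphi^{\prime q}$, respectively, are said to be
identical if

(a) p=q, and

(b) A and A’ are mapped on the same euclidean plane A in such a way that
each pair of circles $\varepsilon_1^{\prime i}$ and $\varepsilon_2^{\prime i}$ coincides in position and orientation with a
pair $\varepsilon_1^j$ and $\varepsilon_2^j$, and each $\varphi^i$ is isotopic to a $\varphi^{\prime j}$ on A.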

2. We now define a set of transformations, called moves, which operate 
on and modify the curves and surface of a Heegaard diagram. The moves will 
fall into three classes or types; the first will modify the canonical curves, the 
second will modify the plane A by rotating a portion of it through a multiple 
of $\pi$, the third will modify the canonical curves and the plane A by the
addition (or subtraction) of canonical curves. 

Type I. A. A move of Type IA simply changes the orientation of a given 
canonical curve. 

Let g be a 1-sphere on A not meeting any of the circles (images of the
canonical e curves) and containing in its interior at least one circle, say
$\varepsilon_1^i$, and at most two such circles. In the latter case, the two circles must be
adjacent but not the two circles of a pair, i.e., images of the same e. Sever
A along g and call the inner and outer lips of the cut g and g’, respectively.
Remove the interior of g from A and attach it once more, this time, however,
by matching corresponding points of $\varepsilon_1^i$ and $\varepsilon_2^i$. We have lost the canonical
pair $\varepsilon_1^i$ and $\varepsilon_2^i$ and have gained a new pair of circuits, g and g’. The two cases
give us the moves of Types IB and IC, i.e.,

B. g contains only one circle, 

C. g contains two circles. 

A move of Type IA, B or C is its own inverse. The effect of a move of 
Type IB is to replace a canonical curve by one isotopic to it. The effect of a 
move of Type IC is to replace a canonical curve by the “sum” of it and 
another canonical curve. 

D. A move of Type ID is the replacement of the image on A of a canonical
f curve, say $\varphi^i$, by the sum of the images of $f^i$ and some $f^j$, i.e., by $\varphi^i+\varphi^j$.




The move is effected by deforming $\varphi^i$ on A until it has a 1-simplex in common
with $\varphi^j$. Then by removing this 1-simplex we obtain a circuit $\varphi^{\prime i}=\varphi^i+\varphi^j$,
which after a slight deformation will have no points in common with $\varphi^j$.
The replacement of $\varphi^i$ by $\varphi^{\prime i}$ is the move of Type ID.
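
On the level of homology the move has a simple reading, offered here as an interpretation rather than as a statement of the text: if each $\varphi^i$ is recorded by its row of exponent sums in the matrix of Part II, §2, then replacing $\varphi^i$ by $\varphi^i+\varphi^j$ adds row $j$ to row $i$. The sketch below, with the hypothetical helper `move_ID` and an invented 2-by-2 matrix, checks that the determinant and the greatest common divisor of the entries are unchanged, as one expects of a unimodular row operation.

```python
from functools import reduce
from math import gcd


def move_ID(alpha, i, j):
    """Return a copy of alpha with row j added to row i (rows = relations)."""
    if i == j:
        raise ValueError("the two curves must be distinct")
    new = [row[:] for row in alpha]
    new[i] = [x + y for x, y in zip(alpha[i], alpha[j])]
    return new


def det2(m):
    """Determinant of a 2-by-2 integer matrix."""
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]


def entry_gcd(m):
    """Greatest common divisor of all entries (the first determinantal divisor)."""
    return reduce(gcd, (x for row in m for x in row), 0)


alpha = [[2, 0], [0, 3]]      # hypothetical exponent matrix
beta = move_ID(alpha, 0, 1)   # first row becomes (2, 3)
print(beta, det2(alpha), det2(beta), entry_gcd(alpha), entry_gcd(beta))
```

In this example both matrices have determinant 6 and entry gcd 1, so the rank and invariant factors agree.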

When a diagram has been modified by a move of Type IB or C, its circles 
will no longer be in standard form, i.e., of equal radii and equally spaced 
along a straight line. However, an isotopic deformation of A in $S_3$ will bring
them into standard form. We shall always suppose this done when we operate 
on A by a move of Type IB or C. 

Type II. Let g be a circuit on A having no points in common with the 
canonical set {e} and let g’ be its position after a small isotopic deformation 
such that g and g’ have no points in common, i.e., they have the appearance 
of two concentric circles, where g’ is interior to g, say. Let B be the band- 
shaped region bounded by g and g’. Now rotate g’ and that part of A interior 
to it through a positive angle of $\pi$ or $2\pi$, keeping g and that part of A exterior
to it fixed. We note particularly that if one of a pair of canonical circles lies 
within g’ and the other without, only one of them is rotated. The result is a 
distortion in the band B; however, if B is suitably subdivided into simplexes 
of fine enough mesh, its structure will remain unaltered. 

However, we do not wish to employ this type of transformation in its most 
general form but only in two certain cases when the region interior to g’ con- 
tains either one or two canonical curves. 

A. Rotation through a positive angle $\pi$ when the 1-sphere g’ contains two
consecutive circles.

B. Rotation through a positive angle $2\pi$ when the 1-sphere g’ contains
only a single circle.

Inverses of moves of Types IIA and B will be rotations through negative
angles.

Type III. A move of this type adds a pair of canonical circles and a new
canonical curve to the plane A. It is effected as follows: Let $\varepsilon_1^{p+1}$ and $\varepsilon_2^{p+1}$
be a pair of circuits on A each bounding a 2-cell, and let $\varphi^{p+1}$ be an arc joining
the point $P_1$ of $\varepsilon_1^{p+1}$ and the point $P_2$ of $\varepsilon_2^{p+1}$. The arc $\varphi^{p+1}$ must have no other
points in common with $\varepsilon_1^{p+1}$ or $\varepsilon_2^{p+1}$ and no points in common with the two
2-cells bounded by $\varepsilon_1^{p+1}$ and $\varepsilon_2^{p+1}$. Also $\varepsilon_1^{p+1}$, $\varepsilon_2^{p+1}$ and $\varphi^{p+1}$ must not meet
any of the other canonical curves or arcs. Then if we remove the interiors of
$\varepsilon_1^{p+1}$ and $\varepsilon_2^{p+1}$ from A and identify the points of $\varepsilon_1^{p+1}$ and $\varepsilon_2^{p+1}$ so that $P_1$ and
$P_2$ are matched, we can add the pair $\varepsilon_1^{p+1}$ and $\varepsilon_2^{p+1}$ to our canonical circles
and $\varphi^{p+1}$ to our canonical arcs. Its inverse, the removal of a pair of canonical
curves $\varepsilon^{p+1}$ and $\varphi^{p+1}$ under the conditions just described, will be denoted by
III’.
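
The effect of such a move on the matrix of exponents can be sketched as follows. This is an interpretation under explicit assumptions: the new arc is taken to cross the new pair of circles exactly once and to meet nothing else, so that its row of exponent sums is a single entry, whose sign is assumed to be +1; the helper names `rank` and `move_III` and the sample matrix are hypothetical. The difference between the number of rows and the rank, which gives the 1-dimensional Betti number in Part II, is then unchanged.

```python
from fractions import Fraction


def rank(matrix):
    """Rank of an integer matrix by Gaussian elimination over the rationals."""
    m = [[Fraction(x) for x in row] for row in matrix]
    r = 0
    for c in range(len(m[0]) if m else 0):
        pivot = next((k for k in range(r, len(m)) if m[k][c] != 0), None)
        if pivot is None:
            continue
        m[r], m[pivot] = m[pivot], m[r]
        for k in range(r + 1, len(m)):
            factor = m[k][c] / m[r][c]
            m[k] = [a - factor * b for a, b in zip(m[k], m[r])]
        r += 1
    return r


def move_III(alpha):
    """Border alpha with a new row and column whose only non-zero entry is 1."""
    p = len(alpha)
    new = [row + [0] for row in alpha]
    new.append([0] * p + [1])
    return new


alpha = [[2, 4], [4, 8]]   # hypothetical matrix with p = 2 and rank 1
beta = move_III(alpha)     # p = 3
print(len(alpha) - rank(alpha), len(beta) - rank(beta))  # 1 1
```

Both p and the rank rise by one, so their difference is the same before and after the move.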




The surface A seems to play a preferred rôle in the description and
definition of these moves. However, all our moves are also applicable to the surface
A’, where the $\varphi$ curves are represented by pairs of circles. It is not necessary
to note the effect on A’ of one of the moves operating on A; we only note that 
when we modify A by a move of Type IC, A’ is modified by a move of Type 
ID, and vice versa. 

3. Let now A be a Heegaard diagram with the two canonical sets of curves
$\{\varepsilon\}$ and $\{\varphi\}$. We state six lemmas whose proofs follow immediately from the
definitions of the moves. 


Lemma 3. If g is a circuit on A not meeting any of the canonical circles and
containing in its interior $\varepsilon_1^i$ and any number of other circles but not $\varepsilon_2^i$, then
the result of severing A along g and reattaching the piece so removed to A along
$\varepsilon_1^i$ and $\varepsilon_2^i$ may be obtained by successively applying moves of Type IC.


Lemma 4. If the set of curves $\{\varepsilon'\}$ is derived from the canonical set $\{\varepsilon\}$ by a
finite number of moves of Type I, then the set $\{\varepsilon'\}$ is canonical.


Lemma 5. If the canonical set $\{\varepsilon'\}$ is derived from the canonical set $\{\varepsilon\}$ by a
finite number of moves of Type I, then $\{\varepsilon\}$ can be derived from $\{\varepsilon'\}$ by the same
number of moves.


Lemma 6. If the surface A and the canonical sets of curves $\{\varepsilon\}$ and $\{\varphi\}$ of a
Heegaard diagram A are transformed into a surface A’ with the sets of curves $\{\varepsilon'\}$
and $\{\varphi'\}$ by a finite number of moves, then A’, $\{\varepsilon'\}$ and $\{\varphi'\}$ form a Heegaard
diagram.


Lemma 7. If the Heegaard diagram A’ is derived from the Heegaard diagram
A by a finite number of moves, then A can be derived from A’ also by a finite num- 
ber of moves. 


Lemma 8. If the Heegaard diagram A gives rise to the manifolds N and N’,
then N and N’ are homeomorphic.


Corollary. If the manifold M gives rise to the diagram A, which in turn
gives rise to the manifold N, then M and N are homeomorphic.


The proofs of this lemma and corollary follow at once from the construc- 
tion of N and N’ as outlined in §8, Part I. 

4. We are now in a position to give a formal definition of equivalence: two
Heegaard diagrams are said to be equivalent if one can be transformed into a 
diagram identical to the other by a finite number of moves of Types I, II and 
III. 

5. We prove the following theorem: 




THEOREM 1. If the equivalent Heegaard diagrams A and A’ give rise to the
manifolds N and N’, then N and N’ are homeomorphic.


Since A and A’ are equivalent, A can be obtained from A’ by a finite
sequence of moves of Types I, II, and III which transform A’ successively into
the diagrams $A^1, A^2, \dots, A^n=A$. It is therefore necessary and sufficient to
prove that at each stage the transformed diagram $A^{k+1}$ gives rise to a manifold
$N^{k+1}$ homeomorphic to the one, $N^k$, arising from the preceding diagram $A^k$.

This is obviously true when we modify $A^k$ by a move of Type IA, IB, or a
move of Type II, hence we need only prove that the theorem holds when we
employ a move of Type IC, ID, or a move of Type III.

Let, then, $A^k$ be transformed into $A^{k+1}$ by a move of Type IC and let
$A^k$ and $A^{k+1}$ give rise to the manifolds $N^k$ and $N^{k+1}$. We can suppose that the
_ move replaces the canonical curve of A* by the canonical curve 
=e} Let us retain the circuit on the diagram A*+! and slightly de- 
form it so that it does not meet any of the canonical curves of A*+!. We can 
then find in the manifold N**! a 2-cell EZ? lying in the region containing the 
canonical 2-cells E,, and whose boundary is e?. The 2-cell EZ? will divide 
this region, a 3-cell, into two parts. If we unite these two parts along EA,1 
and consider E? as part of the boundary, we have again a 3-cell. We have 
not changed $N^{k+1}$ at all, but obviously from this point of view it may be
considered a manifold arising from $A^k$; hence by Lemma 8, $N^{k+1}$ is homeomorphic
to $N^k$.

A similar argument holds when $A^k$ is transformed by a move of Type ID.

Suppose now that $A^k$ is transformed into $A^{k+1}$ by a move of Type III
which adds the canonical curves $e^{p+1}$ and $\varphi^{p+1}$ to the canonical curves of $A^k$.
In the manifold $N^{k+1}$ the curves $e^{p+1}$ and $\varphi^{p+1}$ are such that Lemma 1 is
applicable where the 2-cell $E^{p+1}$ plays the rôle of $E_2^1$ of the lemma. If we make
the necessary modifications, we transform the canonical surface and regions
of $N^{k+1}$ into new ones. But with these latter canonical surface and regions
$N^{k+1}$ may be considered as a manifold arising from $A^k$; hence, once more $N^k$
and $N^{k+1}$ are homeomorphic and the theorem is proved.

6. If, then, the Heegaard diagrams A and A’ are equivalent, i.e., if one 
can be transformed into the other by means of the moves, then any two 
manifolds N and N’ that arise from them are homeomorphic. To give greater 
justification to the definition and notion of equivalence, we must prove con- 
versely that if two manifolds M and M’ are homeomorphic and they give rise 
to the Heegaard diagrams A and A’ then the two diagrams are equivalent. 
In other words, we must prove that the moves are indeed sufficient to trans- 
form A into A’. 




IV. HEEGAARD DIAGRAMS ARISING FROM A MANIFOLD 


1. Variations in a Heegaard diagram may arise in 

(a) the choice of the canonical surface, 

(b) the choice of the canonical 2-cells of one of the canonical regions, 

(c) the choice of the canonical 2-cells of the other canonical region, 

(d) the method of immersion in $S_3$, i.e., the method of mapping L on A.
In this section we study the relationship between any two diagrams arising 
from a manifold and prove (Theorems 2-8) that any two such diagrams are 
equivalent. 

We shall use consistently the following notation: M, M’, N, etc., shall
denote a 3-dimensional manifold; L, a canonical surface, $R_1$ and $R_2$ the two
canonical regions. Canonical sets of 2-cells of $R_1$ shall be denoted by {E},
{E’}, etc., of $R_2$ by {F}, {F’}, etc. The canonical sets of curves (boundaries
of the canonical 2-cells) will be denoted by {e}, {f}, etc. We shall denote a
Heegaard diagram by A, A’, etc., and we shall use the notation $A=(A, \varepsilon, \varphi)$
to signify that the Heegaard diagram A consists of the plane A (image of a
canonical surface L), the pairs of circles, $\varepsilon_1^i$, $\varepsilon_2^i$ (images of the canonical e
curves), and the arcs $\varphi^i$ (images of the canonical f curves).

2. In Theorem 2 below we prove that two diagrams arising from the same 
canonical surface and curves are equivalent, i.e. two methods of immersing L, 
{e} and {f} in $S_3$ (see (d) of §1) yield equivalent diagrams.


THEOREM 2. Let L be a canonical surface of a manifold M, {e} and {f}
canonical sets of curves of the two regions; then if L, {e} and {f} give rise to the
Heegaard diagrams $A=(A, \varepsilon, \varphi)$ and $A'=(A', \varepsilon', \varphi')$, A and A’ are equivalent.


In this theorem A and A’, $\varepsilon^i$ and $\varepsilon^{\prime i}$, $\varphi^i$ and $\varphi^{\prime i}$ correspond to the same L,
$e^i$ and $f^i$, respectively, of M. Let us superimpose the two planes A and A’;
we can assume, without any loss of generality, that the canonical circles
coincide. We choose our notation so that $\varepsilon_1^i$, $\varepsilon_2^i$ and $\varepsilon_1^{\prime i}$, $\varepsilon_2^{\prime i}$ ($\varphi^i$ and $\varphi^{\prime i}$) are
images of the same $e^i$ ($f^i$) of L.

The circles $\varepsilon_1^i$, $\varepsilon_2^i$ and $\varepsilon_1^{\prime i}$, $\varepsilon_2^{\prime i}$ will not coincide, in general, but we shall
show how to modify A’ by means of our moves so that the two images of each
e do coincide and the two images of each f are isotopic.

We regard the two planes A and A’ as infinitely close but distinct, as, for 
example, two sheets of a Riemann surface. There will be no misunderstanding 
then when we speak of the intersection of a curve on A with a curve on A’, 
and at the same time insist that when we modify A’ by a move, no change 
occurs on A. 

Let us denote by $\lambda^i$, $i=1, 2, \dots, 2p-1$, the segments of the
straight line in A through the centers of the canonical circles. The segment
$\lambda^k$ then joins the $k$th and the $(k+1)$st circles. By subdivision of A we can
make these segments 1-cells of A. Let $\lambda^{\prime i}$, $i=1, 2, \dots, 2p-1$, be the images
of $\lambda^i$ in A’. Then if $\lambda^1$ joins two circles of A, $\lambda^{\prime 1}$ joins the corresponding
circles of A’.

Since (ef is a 2-cell, to prove A is equivalent to 
A’ it is sufficient to prove that we can modify A’ by our moves so that e# 
and ¢ coincide with e/* and e7*, respectively, and \* is isotopic to \’*. But 
this is quite obviously possible by means of moves ITA and IIB. Indeed, let ¢; 
be the first circle on A (reading from left to right, say). The 1-cell A! joins €? 
to e*, say. By a finite number of moves of Type IIA (acting on e/! of A’) 
we can make e/! coincide with ¢,’. Then by another finite sequence of moves 
we can make e’* coincide with e*. Moreover, we can choose this sequence of 
moves in such a fashion that \’! loses all its intersections with all \‘, 71. 
Then by a finite number of moves of Type IIB acting on e’*, we can make \’! 
lose all its intersections with \'. The 1-cells \' and \’! are now isotopic. By 
continuing this process, we can make each \* isotopic to its corresponding 
d’* and each e* (€#) coincident with e/* (e/*). The theorem is therefore 
proved. 


COROLLARY 1. Let $A=(A, \varepsilon, \varphi)$ be a given Heegaard diagram, and let $\bar{A}
=(\bar{A}, \bar{\varphi}, \bar{\varepsilon})$ be a diagram obtained from A by mapping the arcs $\varphi^i$ of A as pairs of
circles on $\bar{A}$ and the pairs of circles $\varepsilon_1^i$ and $\varepsilon_2^i$ of A on arcs $\bar{\varepsilon}^i$ of $\bar{A}$. Then if $\bar{A}'
=(\bar{A}', \bar{\varphi}', \bar{\varepsilon}')$ is another diagram similarly obtained, $\bar{A}$ and $\bar{A}'$ are equivalent.*

COROLLARY 2. Let the diagram A be equivalent to the diagram A’; then if the
diagrams $\bar{A}$ and $\bar{A}'$ are obtained from A and A’, respectively, as above, they are
equivalent.


The proofs of these corollaries follow directly from Theorem 2. We only
need to point out that a move of Type IC acting on A is equivalent to a move
of Type ID acting on $\bar{A}$.

3. Theorem 3 below states that two Heegaard diagrams arising from two 
choices of the canonical 2-cells of R; (see (b) §1) are equivalent. 


THEOREM 3. If {e} and {e’} are two sets of canonical curves for the same
region $R_1$ of M, then any diagram $A=(A, \varepsilon, \varphi)$ arising from L, {e} and {f} is
equivalent to any diagram $A'=(A', \varepsilon', \varphi')$ arising from L, {e’} and {f}.

In this theorem, A and A’, $\varphi^i$ and $\varphi^{\prime i}$ correspond to the same L and $f^i$,
respectively, of M, but $\varepsilon$ corresponds to an unprimed e whereas $\varepsilon'$ corresponds
to a primed e.

Let $\varepsilon^{\prime\prime i}$ be the image of $e^i$ on A’. We will prove the theorem by showing
how to modify A’ so that the $\varepsilon'$’s are replaced, one by one, by the $\varepsilon''$’s. The


* From here on, $\bar{A}$, etc., will indicate not the closure of A, etc., but a new A, etc.




diagram A’ will be then transformed into the diagram A’’. By definition, A’
and A’’ are equivalent, by Theorem 2, A and A’’ are equivalent, hence A and
A’ are equivalent. The process by which we modify A’ will be broken up into
three cases.

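Case 1. We suppose first that no $\varepsilon''$ meets an $\varepsilon_1'$ or an $\varepsilon_2'$ and that no $\varepsilon''$
is interior to another.

The $\varepsilon''$ curves will then be circuits on A’ not meeting any of the circles,
$\varepsilon_1^{\prime i}$ or $\varepsilon_2^{\prime i}$. Each $\varepsilon''$ must contain at least one $\varepsilon'$ circle; for any $i$, either
$\varepsilon_1^{\prime i}$ or $\varepsilon_2^{\prime i}$ or both are contained in an $\varepsilon''$; and there is at least one $\varepsilon'$ circle not
contained in any $\varepsilon''$.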

Let e/* be a circle not contained in any e’’. Its partner ¢/* must be con- 
tained in at least one e’’. Several subcases now arise. 

Suppose first only one e’’ circle, say e’’*, contains e/* and that e’’* con- 
tains no other e’. Then by a move of Type IB, in which we sever A’ along 
e’’* and heal it up again by matching e{* and «/*, we transform one of the 
e”’s into one of the ¢’”’s. 

Suppose secondly that only one e’’ circle, say e’’*, contains e{*, but that 
e’’> contains other e’ circles. If «’’* contains only one other such circle, say 
e/”, then by a move of Type IC by which we sever A’ along ¢’”* and heal it up 
again by matching e/* and «/*, we again transform one of the e’’s into one 
of the e’”’s. The circle e’” is now not contained in any e’’ curve. 

If e’’® contains more than one circle besides e/* , we modify A’ by a move 
of Type IC by which we sever A’ along a circuit not meeting e’’* and contain- 
ing e7* and one other circle and then patch it up by matching e/* and «*. 
We thus obtain a new configuration in which there are one fewer circles in- 
terior to ¢’’*, One of the new canonical circles is interior and the other is 
exterior to ¢’’. By repeating this process we can remove the circles interior 
to e’’* one by one until only two remain, and then we are back to the preced- 
ing case. 

By a finite number of steps we can modify A’ so that all the e’’s become 
e’”’s. Hence, A’ is transformed into a diagram A’’ representing L, {e} and 
{f}, and is therefore equivalent to A by Theorem 2, as we wished to prove. 

Case 2. Let no e’’ meet an e’ circle, but suppose that some e’’’s are in- 
terior to others. 

Suppose first that the e’”s form “nests” of circuits, i.e. if e’’* contains 
e’’> and e’’¢, then either e’’* contains e’’¢ or e’’* contains e’’*. By considera- 
tions entirely analogous to those of Case 1 we can modify A’ so that the 
innermost e’’ of a nest is changed into an ¢’ and hence the nest has one fewer 
e’’ curves. After a finite number of steps we are led back to Case 1. 

Now suppose that the e’”’s are contained in other e’”’s in a general fashion. 




Several cases arise, but it is not necessary to go into details. We can always 
transform A’ so that the e’”’s are transformed into nests of curves and we are 
back to the former case. 

Case 3. We suppose that the e’’’s do meet the e’’s. 

Let us examine, first of all, the nature of the intersection of an unprimed 
canonical 2-cell, say E’, with a primed canonical 2-cell, say E’*, in the mani- 
fold M. The intersection will consist of a number of 1-cells and circuits. By a 
slight deformation of Z’* and its boundary e’* we can arrange so that e’* 
meets e’, the boundary of £’, only in a finite number of points, such that the 
end points of any 1-cell common to E/ and E’* are distinct. 

Keep j fixed and let & run from 1 to p. We obtain on e’ a certain number of 
points, grouped into pairs, where each pair is the boundary of a 1-cell common 
to Ei and some E’ and no pair “separates” another on e’. Hence there is at 
least one pair, say P; and P2, such that one of the two arcs into which P; and 
P, divide e’ contains no other intersection point. Let us call this arc a. The 
points P; and P2 are on some e’, say e’*. 

We now return to the Heegaard diagram A’ on A’. The canonical curve 
e’* is represented by the pair of circles ¢/* and e¢* ; each circle has on it the 
images of P,; and P2, which we continue to call P, and P2, and a is now mapped 
on a 1-cell a of A’ joining P; and P; of ef* , say. The arc a meets no other circle. 

We now show that we can always modify A’ so as to lose the two inter- 
sections P; and P2. The circle ¢{* and the arc @ divide A’ into two regions. 
Let A be the one which does not contain e/*. Several cases arise. Suppose 
first that A contains no canonical circle at all. Then, obviously, a can be de- 
formed so that the intersections P; and P? are lost. 

Suppose next that A contains only one canonical circle. Let g be a circuit 
in A’—A which lies very close to a and that part of e{* bounding A’—A. 
Then if we modify A’ by a move of Type IC in which we sever A’ along g and 
patch it up again by matching e/* and e/*, we find that the two intersections 
P, and P: have disappeared. 

Suppose, finally, that A contains several canonical circles. Choose g as 
before and again sever A’ along it and patch it up again along corresponding 
points of e{* and e/*. Again, we lose the two intersections, P; and P», and, 
by Lemma 3, this operation is the product of moves of Type IC. Hence since 
we can remove all the intersections two by two this case is reduced to the 
former and the theorem is completely proved. 

4. Theorem 4 states that two Heegaard diagrams arising from two choices
of the canonical 2-cells of $R_2$ (see (c), §1) are equivalent. In the previous
theorem, the circles on $\Lambda$ and $\Lambda'$ were images of different e's; in this theorem,
the $\phi$'s are images of different f's.




THEOREM 4. If {f} and {f'} are two sets of canonical curves for the same
region $R_2$ of M, then any diagram $\Delta = (\Lambda, \epsilon, \phi)$ arising from L, {e} and {f}
is equivalent to any diagram $\Delta' = (\Lambda', \epsilon', \phi')$ arising from L, {e} and {f'}.

This theorem follows at once from Corollaries 1 and 2 of Theorem 2 and
Theorem 3 by mapping $\Delta$ and $\Delta'$ on $\bar\Delta$ and $\bar\Delta'$ as in Corollary 1.

5. We have proved, Theorems 2, 3, and 4, that any two Heegaard dia- 
grams arising from a manifold are equivalent, provided that we chose the 
same canonical surface in each case. We have left to prove that any two dia- 
grams whatsoever arising from a manifold are equivalent (see (a), §1). To 
prove this we make use of a special canonical surface whose construction is 
given below. 

6. Construction of the special canonical surface. Let M be a manifold 
which we assume simplicial, G the linear graph consisting of all the 0- and 1- 
simplexes of M, H a subcomplex of G. Let M’ be the first derived complex of 
M, and M”’ the first derived complex of M’; and let G’, G’’, H’, H’’ be the 
induced subdivisions on G and H, respectively. It may happen that the 
boundary of the M’’-neighborhood of H is a canonical surface, in which case 
we call it a special canonical surface and any diagram arising from it a special 
Heegaard diagram. In particular, if H is G itself, we prove 


Lemma 9. The boundary, L, of the M''-neighborhood of G is a special 
canonical surface. 

Let M* be the dual of M, G* the linear graph consisting of all the 0- and
1-cells of M*, and G*' the subdivision of G* induced by M'. Further, let $R_1$
and $R_2$ be the M''-neighborhoods of G and G*, respectively. We prove that L
is a canonical surface by showing

(a) that every 3-simplex of M'' belongs either to $R_1$ or $R_2$;

(b) L is the common boundary of $R_1$ and $R_2$, and is a closed and connected
2-dimensional manifold;

(c) $R_1$ and $R_2$ are canonical regions.
It then follows from our definitions (Part I, §4) that L is a canonical surface
dividing M into the canonical regions $R_1$ and $R_2$.

(a) Let $A_0^i$ be the vertices of M; $B_1^i$, $B_2^i$, $B_3^i$ the vertices of M' on the
1-, 2- and 3-cells of M, respectively, and $C_1^i$, $C_2^i$, $C_3^i$ the vertices of M'' on the
1-, 2- and 3-cells of M', respectively. Every 3-simplex of M' is of the form
$A_0B_1B_2B_3$; every 3-simplex of M'' is of the form $A_0C_1C_2C_3$ or $B_aC_1C_2C_3$, where
a = 1, 2, or 3. Hence every 3-simplex of M'' is incident with an $A_0$ or a $B_1$, i.e.
a vertex of G', or else with a $B_2$ or a $B_3$, i.e. a vertex of G*'. That is, every
3-simplex of M'' belongs to $R_1$ or to $R_2$.


† See second footnote on page 89.




(b) In any 3-simplex, $\sigma_3^i$, of M', the component 3-simplexes of $R_1$ and $R_2$
are grouped as the simplexes of §2 of Part I; hence in any 3-simplex of M',
the boundaries of $R_1$ and $R_2$ have in common a 2-cell which we call $E_2^i$. The
boundary of $E_2^i$ is a circuit which may be thought of as composed of four
1-cells, each on one of the faces of $\sigma_3^i$. Let us call the 1-cell on the 2-simplex
$\sigma_2^j$, $E_1^j$. It is quite clear that the incidence matrix of the 2- and 3-simplexes of
M' is the same as the incidence matrix of all the 1- and 2-cells $E_1^j$ and $E_2^i$.
It follows at once from the fact that M is a manifold that $L = \sum E_2^i$ (the sum
being taken over all i's) is a closed, connected 2-dimensional cellular manifold,
from which it can be deduced that, upon subdivision, L will become a closed,
connected 2-dimensional simplicial manifold. Furthermore, it follows that L
is orientable or non-orientable according as M is orientable or non-orientable,
and conversely.†

(c) Since the M''-neighborhood of a 1-simplex of M is a 3-cell, the
M''-neighborhood of any tree of 1-simplexes of M is also a 3-cell. Let the linear
graph G contain p independent circuits; then the removal of p properly chosen
1-simplexes, say $\sigma_1^1, \sigma_1^2, \dots, \sigma_1^p$, from G will reduce it to a tree T. Let $B_1^i$
be the vertex of M' on $\sigma_1^i$ ($i = 1, \dots, p$) and let $E^i$ be the aggregate of
2-simplexes of M'' that lie in the 2-cell of M* dual to $\sigma_1^i$ and are incident with
$B_1^i$ ($i = 1, \dots, p$). The M''-neighborhood of T is a 3-cell and if we add to it
the remainder of the M''-neighborhood of G excepting the 2-cells $E^1, \dots,
E^p$, we obtain a 3-cell. Hence $R_1$ is a canonical region.

By exactly the same argument it can be shown that $R_2$ is also a canonical
region. It therefore follows that L is a canonical surface dividing the manifold
M into the two canonical regions, $R_1$ and $R_2$, as we wished to prove.

7. We have proved incidentally that L and M are together orientable or
non-orientable. Also, since the number of 2-cells which when removed reduce
$R_1$ to a 3-cell is equal to the maximum number of non-intersecting circuits
that can be drawn on L without disconnecting it, it follows that the
number of 2-cells removed from $R_1$ is equal to the number removed from $R_2$.
In other words, the cyclomatic number of G is equal to the cyclomatic number
of G*.†
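
The cyclomatic number used here can be computed directly from a list of vertices and 1-simplexes. The following Python sketch is an illustration only and is not part of the original paper; the function name `cyclomatic_number` and the sample graph (the 1-skeleton of a tetrahedron) are assumptions chosen for the example. It uses the standard count E - V + C (edges, vertices, connected components), which agrees with the number of 1-simplexes that must be removed to reduce each component of the graph to a tree.

```python
def cyclomatic_number(vertices, edges):
    """Return E - V + C for a finite graph given as vertex and edge lists."""
    parent = {v: v for v in vertices}

    def find(v):
        # Follow parent links to the representative of v's component.
        while parent[v] != v:
            parent[v] = parent[parent[v]]  # path halving
            v = parent[v]
        return v

    components = len(parent)
    for a, b in edges:
        ra, rb = find(a), find(b)
        if ra != rb:
            parent[ra] = rb  # merge two components
            components -= 1
    return len(edges) - len(parent) + components


# Hypothetical sample: 1-skeleton of a tetrahedron (4 vertices, 6 edges).
tetra_vertices = [0, 1, 2, 3]
tetra_edges = [(0, 1), (0, 2), (0, 3), (1, 2), (1, 3), (2, 3)]
print(cyclomatic_number(tetra_vertices, tetra_edges))  # 6 - 4 + 1 = 3
```

Removing three suitably chosen edges from this sample graph leaves a spanning tree, in agreement with the footnote's definition.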

8. We prove the following lemma: 


LEMMA 10. Let H and J be subcomplexes of the linear graph G of a manifold
M such that it is possible to build up H from J by successively adding to J 
closed 1-simplexes of G in such fashion that at every step the 1-simplex which is 
being added either has one and only one end point in common with the subgraph 
already built up or else is the third side of a 2-simplex of which the other two sides 


† The cyclomatic number of a linear graph is the number of independent 1-circuits on it, i.e.
the minimum number of 1-simplexes that can be removed which reduce the graph to a tree. 




already belong to the subgraph. Then if the boundary of the M''-neighborhood of 
J is a canonical surface, so is the boundary of the M'’-neighborhood of H, and 
conversely. 


The proof of this lemma follows by induction. Suppose that at the nth
step we have built up the subgraph $J_n$ from J and that the boundary of the
M''-neighborhood of $J_n$ is a canonical surface of M. We add the closed
1-simplex $\sigma_1^{n+1}$ to $J_n$, obtaining the subgraph $J_{n+1}$. By hypothesis, either
$\sigma_1^{n+1}$ has only one end point in common with $J_n$, or else it completes a
triangle, of which the other two sides are already in $J_n$. It is obvious in the first
case that the boundary of the M''-neighborhood of $J_{n+1}$ is a canonical surface
of M.

Let then $\sigma_1^{n+1}$ complete a triangle of which the other two sides belong to
$J_n$. Then if we call $R_1$ the M''-neighborhood of $J_n$, L its boundary and $R_2$ the
remainder of M, that part of $\sigma_1^{n+1}$ lying in $R_2$ can play the rôle of A of Lemma
2. Hence, applying Lemma 2, we obtain a new canonical surface of M which
is precisely the boundary of the M''-neighborhood of $J_{n+1}$. It follows that the
boundary of the M''-neighborhood of H is a canonical surface of M.

Conversely, let us suppose that the boundary of the M''-neighborhood of
$J_{n+1}$ is a canonical surface. Again, if the 1-simplex $\sigma_1^{n+1}$ which was added to
$J_n$ to form $J_{n+1}$ has only one end point in common with $J_n$, then the boundary
of the M''-neighborhood of $J_n$ is a canonical surface of M.

In the second case, let us call E that part of the 2-simplex whose boundary
$\sigma_1^{n+1}$ completes lying in $R_1$, and F that part of the 2-cell dual to $\sigma_1^{n+1}$ lying
in $R_2$. We can choose E and F canonical 2-cells of $R_1$ and $R_2$, respectively, and
so arrange that their boundaries (which meet once and only once) do not meet
any of the other canonical curves. Lemma 1 is then applicable, and therefore
the boundary of the M''-neighborhood of $J_n$ is a canonical surface of M. It
follows that the boundary of the M''-neighborhood of J is a canonical surface
of M and the lemma is proved.

9. Let L and L’ be the boundaries of the M’’-neighborhoods of H and 
J, respectively, as above, and suppose that one (and hence the other) is a 
canonical surface. As an immediate consequence of Lemma 10 we have 


COROLLARY 1. Any special Heegaard diagram arising from L is equivalent
to any special Heegaard diagram arising from L’. 


The proof follows at once from the fact that when we pass from $J_n$ to
$J_{n+1}$, we either do not change the Heegaard diagram at all, or else we modify
it by a move of Type III.

10. At this point it becomes necessary to restrict the notion of homeomorphism.
Let M and $M_0$ be two manifolds which are not only homeomorphic
but also equivalent in the sense of semi-linear analysis situs, i.e., it is possible
to subdivide each of them so that the subdivisions have identical structures
(cf. Alexander, J. W., The combinatorial theory of complexes, Annals of
Mathematics, (2), vol. 31 (1930)). M and $M_0$ will then be called equivalent
manifolds.

From this definition and from Lemma 10 and its corollary follows at once 


THEOREM 5. If M and $M_0$ are equivalent manifolds, G and $G_0$ their linear
graphs, L and $L_0$ the boundaries of the M''- and $M_0$''-neighborhoods of G and $G_0$,
respectively, then any special Heegaard diagram of M arising from L is equivalent
to any special Heegaard diagram of $M_0$ arising from $L_0$.


11. We have one more theorem to prove before we can prove our objec- 
tive, Theorem 7. 


THEOREM 6. Let G be the linear graph of a manifold M, $L_0$ the boundary of
the M''-neighborhood of G, and L any canonical surface whatsoever of M. Then
any Heegaard diagram $\Delta$ of M arising from L is equivalent to any special
Heegaard diagram $\Delta_0$ arising from $L_0$.


Let us note that if the canonical surface L is isotopic to a surface L’, then 
L’ is also a canonical surface of M. 

Let L divide the manifold M into the canonical regions $R_1$ and $R_2$ and
let $E_2^1, \dots, E_2^p$ be a set of canonical 2-cells for $R_1$. After subdivision of M,
if necessary, we can group the simplexes of M into cells, so that $R_1$ is composed
of p+1 3-cells, $E_3^1, \dots, E_3^{p+1}$, where $E_3^i$ is bounded by the 2-cells $E_2^i$ and
$E_2^{\prime i}$ (a 2-cell isotopic to $E_2^i$) and 2-simplexes on L, for $i = 1, 2, \dots, p$; and
$E_3^{p+1}$ is a 3-cell bounded by $\sum (E_2^i + E_2^{\prime i})$ and simplexes on L. By elementary
operations we can further change the structure of M so that all of the 3-cells
$E_3^1, \dots, E_3^{p+1}$, and all of the 2-cells $E_2^1, \dots, E_2^p$, $E_2^{\prime 1}, \dots, E_2^{\prime p}$, become
stars of simplexes having for centers the 0-cells $E_0^1, \dots, E_0^{p+1}$, $P_0^1, \dots,
P_0^p$, and $P_0^{\prime 1}, \dots, P_0^{\prime p}$, respectively. The manifold M is thus transformed
into an equivalent manifold N. Let $L_N$ be the image of L in N.

Consider the linear graph H of N, where


$$H = \sum_{i=1}^{p} \left( E_0^{p+1}P_0^i + P_0^iE_0^i + E_0^iP_0^{\prime i} + P_0^{\prime i}E_0^{p+1} \right),$$


in which $E_0^{p+1}P_0^i$, etc., stands for the 1-simplex joining $E_0^{p+1}$ and $P_0^i$. By
construction, $G_N$, the linear graph of N, can be built up from H as is required
in Lemma 10. Hence any special Heegaard diagram of N, say $\Delta_N$, that arises
from the boundary of the N''-neighborhood of $G_N$ is equivalent to any special
Heegaard diagram that arises from the boundary of the N''-neighborhood of
H. But this latter surface is isotopic to $L_N$. Hence $\Delta_N$ is equivalent to any
Heegaard diagram arising from $L_N$. But since $L_N$ is the image of L, $\Delta$ is such
a diagram, hence $\Delta$ and $\Delta_N$ are equivalent. But by Theorem 5, $\Delta_N$ and $\Delta_0$
are equivalent, therefore $\Delta$ and $\Delta_0$ are equivalent as we wished to prove.

12. From Theorems 2, 3, 4, 5, and 6 follows the all inclusive theorem 


THEOREM 7. If $\Delta$ and $\Delta'$ are any two Heegaard diagrams whatsoever arising
from a manifold M, then $\Delta$ and $\Delta'$ are equivalent.


V. THE GENERAL THEOREM

1. Several theorems follow immediately from those of Part IV.


THEOREM 8. If the equivalent manifolds M and N give rise to the Heegaard
diagrams $\Delta$ and $\Delta'$, then $\Delta$ and $\Delta'$ are equivalent.


THEOREM 9. If the diagram $\Delta$ gives rise to the manifold M, which in turn
gives rise to the diagram $\Delta'$, then $\Delta$ and $\Delta'$ are equivalent.


From Theorem 9 and the preceding theorems follows

THEOREM 10. If the Heegaard diagrams $\Delta$ and $\Delta'$ give rise to equivalent
manifolds, then $\Delta$ and $\Delta'$ are equivalent.

THEOREM 11. If the manifolds M and N give rise to equivalent diagrams,
then M and N are equivalent. 

From Theorems 1, 8, 10, and 11 follows the general theorem 

THEOREM 12. Equivalent manifolds give rise to and arise from equivalent 
Heegaard diagrams and equivalent Heegaard diagrams give rise to and arise 
from equivalent manifolds. 


The problem of determining when two given 3-dimensional manifolds are 
equivalent (in the sense that we can pass from one to the other by elementary 
operations) is thus reduced to the problem of determining when two given 
Heegaard diagrams are equivalent (in the sense that we can pass from one 
to the other by means of the moves defined in Part III).


PRINCETON UNIVERSITY, 
PRINCETON, N. J. 


NON-CYCLIC ALGEBRAS OF DEGREE AND EXPONENT 
FOUR* 


BY 
A. ADRIAN ALBERT 


1. Introduction. I have recently† proved the existence of non-cyclic
normal division algebras. The algebras I constructed are algebras A of order
sixteen (degree four, so that every quantity of A is contained in some quartic
sub-field of A) containing no cyclic quartic sub-field and hence not of the
cyclic (Dickson) type. But each A is expressible as a direct product of two
(cyclic) algebras of degree two (order four). Hence the question of the existence
of non-cyclic algebras not direct products of cyclic algebras, and therefore
of essentially more complex structures than cyclic algebras, has remained
unanswered.

The exponent of a normal division algebra A is the least integer e such
that $A^e$ is a total matric algebra. A normal division algebra of degree four has
exponent two or four according as it is or is not expressible as a direct product
of algebras of degree two.‡ I shall prove here that there exist non-cyclic normal
division algebras of degree and exponent four, algebras of a more complex
structure than any previously constructed normal division algebras.

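2. Algebras of order sixteen. We shall consider normal simple algebras of
order sixteen (degree four) over a field K. Algebra A has a quartic sub-field
K(u, v) where
(1) $u^2 = \rho$, $v^2 = \sigma$ ($\rho$, $\sigma$ in K),


such that neither $\rho$, $\sigma$, nor $\sigma\rho$ is the square of any quantity of K. Algebra A
contains quantities

$j_1$, $j_2$, $j_3 = j_1 j_2$
such that
(2) $j_1 u = u j_1$, $j_1 v = -v j_1$, $j_1^2 = g_1 = \gamma_1 + \gamma_2 u$ ($\gamma_1$, $\gamma_2$ in K),
(3) $j_2 v = v j_2$, $j_2 u = -u j_2$, $j_2^2 = g_2 = \gamma_3 + \gamma_4 v$ ($\gamma_3$, $\gamma_4$ in K),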
(4) joji = = = ¥s+ (vs, Ye in K), 


* Presented to the Society, August 31, 1932; received by the editors June 9, 1932.

† In a paper published in the Bulletin of the American Mathematical Society, June, 1932.
(Designated by Albert 1.)

‡ See Theorem 6 of my Normal division algebras of degree four, etc., these Transactions, vol. 34
(1932), pp. 363-372. (Designated by Albert 2.)




— Yeuv 


(m+ vot) (Ys — 42) 


(5) 


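A necessary and sufficient condition that A be associative is that
(6) $\gamma_5^2 - \gamma_6^2\sigma\rho = (\gamma_1^2 - \gamma_2^2\rho)(\gamma_3^2 - \gamma_4^2\sigma)$.


A necessary and sufficient condition* that A be not expressible as a direct
product of two algebras of degree two (that is, have exponent four) is that the
equation
(7) $\alpha_1^2 - \alpha_2^2\sigma - (\gamma_1^2 - \gamma_2^2\rho)\alpha_3^2 = 0$
be impossible for any $\alpha_1$, $\alpha_2$, $\alpha_3$ not all zero and in K.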

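Algebra† A has a sub-algebra $B = (1, v, j_1, v j_1)$ over K(u). This algebra is a
generalized quaternion algebra and it is well known that B is a division algebra
if and only if


(8) $g_1 \neq a_1^2 - a_2^2\sigma$


for any $a_1$ and $a_2$ in K(u). But if $a_1 = \alpha_1 + \alpha_2 u$, $a_2 = \alpha_3 + \alpha_4 u$, the equation
$g_1 = a_1^2 - a_2^2\sigma$ implies that
$\gamma_1 + \gamma_2 u = [\alpha_1^2 + \alpha_2^2\rho - \sigma(\alpha_3^2 + \alpha_4^2\rho)] + 2(\alpha_1\alpha_2 - \sigma\alpha_3\alpha_4)u$
so that $\gamma_1 = \alpha_1^2 + \alpha_2^2\rho - \sigma(\alpha_3^2 + \alpha_4^2\rho)$. We have now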

THEOREM 1. A sufficient condition that B be a division algebra is that the
quadratic form


(9) $Q \equiv (\alpha_1^2 + \alpha_2^2\rho) - \sigma(\alpha_3^2 + \alpha_4^2\rho) - \gamma_1\alpha_5^2$


in the variables $\alpha_1, \dots, \alpha_5$ shall not vanish for any $\alpha_1, \dots, \alpha_5$ not all zero and
in K.

For if the sufficient condition of Theorem 1 were satisfied and yet B were
not a division algebra we would have $\gamma_1 = \alpha_1^2 + \alpha_2^2\rho - \sigma(\alpha_3^2 + \alpha_4^2\rho)$ so that
Q = 0 for $\alpha_1$, $\alpha_2$, $\alpha_3$, $\alpha_4$ in K and $\alpha_5 = 1$, a contradiction.
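
The expansion of $a_1^2 - a_2^2\sigma$ in K(u) that underlies Theorem 1 can be checked numerically. The following Python sketch is an illustration only and is not part of the original paper: it represents a quantity $c_0 + c_1 u$ of K(u) as a pair `(c0, c1)` with $u^2 = \rho$, and it uses integers as hypothetical stand-ins for quantities of K. The function names `mul`, `g1_from` and `closed_form` are assumptions made for the example.

```python
def mul(a, b, rho):
    """Multiply a0 + a1*u by b0 + b1*u, where u*u = rho; pairs are (c0, c1)."""
    return (a[0] * b[0] + a[1] * b[1] * rho, a[0] * b[1] + a[1] * b[0])


def g1_from(alpha, rho, sigma):
    """Return (gamma1, gamma2) with a1^2 - a2^2*sigma = gamma1 + gamma2*u,
    where a1 = alpha[0] + alpha[1]*u and a2 = alpha[2] + alpha[3]*u."""
    a1 = (alpha[0], alpha[1])
    a2 = (alpha[2], alpha[3])
    s1 = mul(a1, a1, rho)
    s2 = mul(a2, a2, rho)
    return (s1[0] - sigma * s2[0], s1[1] - sigma * s2[1])


def closed_form(alpha, rho, sigma):
    """The same two components written out term by term."""
    al1, al2, al3, al4 = alpha
    gamma1 = al1**2 + al2**2 * rho - sigma * (al3**2 + al4**2 * rho)
    gamma2 = 2 * (al1 * al2 - sigma * al3 * al4)
    return (gamma1, gamma2)


# Hypothetical integer sample values standing in for quantities of K.
sample = (1, 2, 3, 4)
print(g1_from(sample, rho=5, sigma=7))      # (-602, -164)
print(closed_form(sample, rho=5, sigma=7))  # (-602, -164)
```

With $\alpha_5 = 1$ and $\gamma_1$ equal to the first component, the form Q of (9) vanishes, which is the contradiction used in the proof.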

It is also known‡ that, when B is a division algebra, A is also a division
algebra if and only if there is no quantity X in B for which 


(10) g2 = X'X, 


where if X =b+dj, then X’=b(—u)+d(—x)aj; with a and b of course in 
K(u, 2). 


* See Albert 2. 

† For the properties of this section see my paper in these Transactions, vol. 32 (1930), pp. 171-
195. (Designated hereafter by Albert 3.)

‡ See L. E. Dickson's Algebren und ihre Zahlentheorie, p. 64, for both the condition that B be
a division algebra and A be a division algebra.




I have proved* that 
(11) (bj2)? = fat far, = fs + four, 
where if 
(12) b = Bi + + (83 + Bav)u, d = 6, + + (63 + 
and 
(13) b, = BP + B2o — p(BF + Bec), be = — 
(14) d, = 62 + — p(6? + 520p), de = 2(6:52 — opd3d,), 
then 
(15) fs = + fa = Diva + 
fs = dyys + doapye, fe = dyye + doys. 
I have also shown that if go=X’X then 
(16) fat — 
But then y3b2= —~yahi, Ysd2= so that from (162), (15), 
(17) — veo) = (ve — + — op) 
If A is associative then (6) is satisfied. Also g2+0 so that go(—v) #0, y? —y2oe 
~0. Then (17) is equivalent to 
(18) = Ysbi + yadily? — 
As in the proof of Theorem 1 we have immediately 

THEOREM 2. A sufficient condition that A with division sub-algebra B be a 
division algebra is that the quadratic form 

Q = ys[(a? + — plas? + a?) | 


(19) 
+ — v2 p)[(ak + — pla? + asap) | — ysvsas? 


shall not vanish for any oy, - + + , a not all zero and in K. 


* Albert 3, p. 178.


3. Algebras over K(q). Let L = K(q) be a quadratic field over K where


(20) $q^2 = \delta = \delta_1^2 + \delta_2^2$ ($\delta_1$ and $\delta_2$ in K).


It is well known that if K contains no quantity k such that $k^2 = -1$ then every
cyclic quartic field over K contains a quadratic sub-field L of the above type.
Hence a sufficient condition that an algebra of degree four be non-cyclic is
that A contain no quadratic sub-field L as above. But also A contains no sub-field
equivalent to any given quadratic field L if and only if $A \times L$ is a division
algebra.* Hence we have


THEOREM 3. If no k in K has the property $k^2 = -1$, a sufficient condition
that a normal simple algebra A of order sixteen over K be a non-cyclic normal
division algebra is that $A \times L$ be a division algebra for every quadratic field
L = K(q),

(21) $q^2 = \delta = \delta_1^2 + \delta_2^2$ ($\delta_1$ and $\delta_2$ in K).


We shall apply Theorem 3 as follows. We shall choose a particular field of
reference, K. We shall then define A by a choice of $\rho$, $\sigma$, $\gamma_1, \dots, \gamma_6$. Then
also $A \times L$ is evidently a normal simple algebra (of the same kind as A over
K) over L when we show that neither $\rho$, $\sigma$, nor $\sigma\rho$ is the square of any quantity
of L (not merely K). We shall then prove that A (not $A \times L$, which can
have exponent two) has exponent four, while $A \times L$ is a division algebra.
This latter step will be an application of Theorems 1 and 2 applied to
$A \times L$ over L. The algebras A over K will be non-cyclic algebras of exponent
four by Theorem 3.

4. The field K. Let F be any real number field, and let x, y, and z be independent
marks (indeterminates). The field F(x, y, z) = K is a function field
consisting of all rational functions with (real) coefficients in F of x, y, z. We
shall deal with quadratic forms Q and equations Q = 0 so that we shall always
be able to delete denominators and hence take our quantities in


J = F[x, y, z],


the domain of integrity consisting of all polynomials in x, y, z with coefficients
in F. We shall of course also consider the domains F[x], F[x, y], etc.

Consider a field K(q) as in §3. It is evident that the quantity q defining
such a quadratic field may always be chosen so that $\delta$, $\delta_1$, $\delta_2$ are in J. Also in
a quadratic form Q = 0 with coefficients in J and variables over K(q) we may
always take the variables to be in the domain of integrity J[q] of all quantities
of the form

a + bq
where a and b are in J.

Every quantity a = a(x, y, z) of J has a highest power $z^n$ with coefficient
in F[x, y] not identically zero. We shall call n the z-degree of a, the coefficient
of $z^n$ the z-leading coefficient of a. Similarly a has an x-degree, y-degree,
x-leading coefficient, y-leading coefficient. A restriction of the z-degree of a certain
expression and its z-leading coefficient evidently does not affect its
x-degree, etc.


* Cf. Albert 1. 




If the coefficient of $z^n$ above is b(y, x) and the coefficient of the highest
power $y^m$ of y in b is c(x), then m is called the (z, y)-degree of a, c(x) the
(z, y)-leading coefficient of a. Finally the degree of c(x) is the (z, y, x)-degree
of a, its leading coefficient in F, the (z, y, x)-leading coefficient of a.
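
The nested degrees just defined can be read off mechanically from the terms of a polynomial. The following Python sketch is an illustration only and is not part of the original paper: a polynomial of J is stored as a dictionary from exponent triples `(ex, ey, ez)` to nonzero coefficients, and the function name `nested_leading` and the sample polynomial are assumptions made for the example.

```python
def nested_leading(p, order):
    """p maps exponent triples (ex, ey, ez) to nonzero coefficients.
    order lists variable indices (0 = x, 1 = y, 2 = z), most significant
    first, e.g. (2, 1, 0) for the (z, y, x)-degree.
    Returns (tuple of successive degrees, leading coefficient in F)."""
    terms = dict(p)
    degrees = []
    for var in order:
        top = max(e[var] for e in terms)  # highest power of this variable
        degrees.append(top)
        # Keep only the terms of the corresponding leading coefficient.
        terms = {e: c for e, c in terms.items() if e[var] == top}
    (coeff,) = terms.values()  # all three exponents fixed: one term remains
    return tuple(degrees), coeff


# Hypothetical sample: a = 3*x*z^2 - 5*y*z^2 + 7*x^4.
a = {(1, 0, 2): 3, (0, 1, 2): -5, (4, 0, 0): 7}
print(nested_leading(a, (2, 1, 0)))  # ((2, 1, 0), -5)
print(nested_leading(a, (0, 1, 2)))  # ((4, 0, 0), 7)
```

For the sample, the z-leading coefficient is 3x - 5y, its y-leading coefficient is -5, and so the (z, y, x)-leading coefficient is -5, while the (x, y, z)-leading coefficient is 7.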

We have similarly (x, y, z)-degree and leading coefficient, etc. Using these 
definitions an elementary result is 


LEMMA 1. The field K contains no quantity k such that $k^2 = -1$.


For let $k^2 = -1$. Then rk = s, where r and s are in J and are both not zero.
It follows that $s^2 = -r^2$. The (x, y, z)-leading coefficient of $s^2$ is evidently a
real square and is positive, that of $-r^2$, negative, so that the polynomial
identity $s^2 = -r^2$ is impossible.
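
The sign argument in this proof can be tried on a concrete polynomial. The following Python sketch is an illustration only and is not part of the original paper: polynomials are stored as dictionaries from exponent triples `(ex, ey, ez)` to integer coefficients, the helper names `poly_mul` and `leading_coeff` and the sample polynomial r are assumptions, and the nested leading coefficient is computed as the coefficient of the lexicographically largest term for the chosen order of the variables.

```python
from itertools import permutations, product


def poly_mul(p, q):
    """Multiply polynomials stored as {(ex, ey, ez): coefficient}."""
    out = {}
    for (ea, ca), (eb, cb) in product(p.items(), q.items()):
        e = tuple(i + j for i, j in zip(ea, eb))
        out[e] = out.get(e, 0) + ca * cb
    return {e: c for e, c in out.items() if c != 0}


def leading_coeff(p, order):
    """Nested leading coefficient; order lists variable indices
    (0 = x, 1 = y, 2 = z), most significant first."""
    top = max(p, key=lambda e: tuple(e[i] for i in order))
    return p[top]


# Hypothetical sample r = 3*x*z^2 - 5*y*z^2 + 7.
r = {(1, 0, 2): 3, (0, 1, 2): -5, (0, 0, 0): 7}
r_squared = poly_mul(r, r)
minus_r_squared = {e: -c for e, c in r_squared.items()}

# For every order of the variables the square has a positive leading
# coefficient and its negative has a negative one.
for order in permutations((0, 1, 2)):
    print(order, leading_coeff(r_squared, order), leading_coeff(minus_r_squared, order))
```

Since a square and the negative of a square have leading coefficients of opposite signs, they cannot be identical polynomials, which is the content of the lemma's proof.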


LEMMA 2. There exist quantities $\lambda$, $\mu$ in F[x, y] such that $\lambda^2 + \mu^2$ is not the
square of any quantity of F(x, y).


We prove the above lemma with the example $\lambda = x$, $\mu = y$. If $x^2 + y^2 = b^2$,
where b is a rational function of x and y, it is evident that b must be a polynomial
in x and y. For the square of a rational function in its lowest terms and
with denominator not unity is never a polynomial. Hence we may put
$b = b_1 x + b_2$ where $b_2$ is in F[y], $b_1$ merely in F[x, y]. Then $x^2 + y^2 = b_1^2 x^2
+ 2 b_1 b_2 x + b_2^2$ identically in x and y. It follows that $b_2^2 = y^2$, $b_2 = \pm y$. Then
$x^2 = b_1^2 x^2 \pm 2 b_1 x y$. Hence $b_1$ divides x and is a power of x. But then
$\pm (2 b_1) y = x - b_1^2 x$ in F[x], $b_1$ in F(x), which is impossible.

5. The S-polynomials. The quadratic forms (9), (19) over L shall be
treated as follows. If $Q = \sum \alpha_i^2 \lambda_i$ with $\lambda_i$ in J (not in J[q]) vanishes for $\alpha_i$ in L
and not all zero, then obviously, by multiplying Q by the square of the least
common denominator, not zero and in J, of the $\alpha_i = a_{i1} + a_{i2} q$ ($a_{i1}$, $a_{i2}$ in K),
we shall have Q = 0 for $\alpha_i$ in J[q], that is, $a_{i1}$ and $a_{i2}$ in J. But then


$Q = \sum \lambda_i [(a_{i1}^2 + a_{i2}^2 \delta) + 2 a_{i1} a_{i2} q] = 0$


so that
$\sum \lambda_i S_i = 0$,


where
(22) $S_i = (a_{i1})^2 + (a_{i2}\delta_1)^2 + (a_{i2}\delta_2)^2$.


We shall call a polynomial of the form (22) an S-polynomial. All such
polynomials have the properties that all their degrees are even, all their
( , , )-leading coefficients positive. Moreover such a polynomial
is zero if and only if $\alpha_i = a_{i1} = a_{i2} = 0$. Hence we have




LEMMA 3. A sufficient condition that a quadratic form $\sum \lambda_i \alpha_i^2$ with $\lambda_i$ in J
shall not vanish for any $\alpha_i$ not all zero and in K(q) is that $\sum \lambda_i S_i$ shall not vanish
for any S-polynomials $S_i$ not all zero.
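
The passage from $\alpha_i^2$ to an S-polynomial rests on the fact that the part of $(a_{i1} + a_{i2}q)^2$ free of q is a sum of three squares when $q^2 = \delta_1^2 + \delta_2^2$. The following Python sketch is an illustration only and is not part of the original paper: it uses integers as hypothetical stand-ins for polynomials of J, and the function names `square_in_Kq` and `s_value` are assumptions made for the example.

```python
def square_in_Kq(a1, a2, delta):
    """Square of a1 + a2*q where q*q = delta.
    Returns (part free of q, coefficient of q)."""
    return (a1 * a1 + a2 * a2 * delta, 2 * a1 * a2)


def s_value(a1, a2, d1, d2):
    """The sum of three squares a1^2 + (a2*d1)^2 + (a2*d2)^2."""
    return a1**2 + (a2 * d1) ** 2 + (a2 * d2) ** 2


# Hypothetical sample values; delta = d1^2 + d2^2 as in (20).
d1, d2 = 1, 2
delta = d1**2 + d2**2
free_part, q_part = square_in_Kq(3, 2, delta)
print(free_part, q_part)      # 29 12
print(s_value(3, 2, d1, d2))  # 29
```

In the sample the part free of q equals the sum of three squares, and it can vanish only when both components are zero, which mirrors the properties of S-polynomials used in Lemma 3.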


6. The multiplication constants of A. We now choose $\rho$, $\sigma$, $\gamma_1, \dots, \gamma_6$ in
J. We shall take
(23) $\sigma$ of even z-degree, even (z, y)-degree, odd (z, y, x)-degree.

We shall define +; and ¥; in terms of certain quantities «, €;, where 
(24) (the z-degree of €5 is odd) > (z-degree of exys); 

(25) (the z-degree of 3 is odd) > (z-degree of ys); 

(26) (the 2-degree of y2) > (z-degree of yea); 

(27) the (2, y)-degree of 3 even, of €5 odd. 

The above conditions are restrictions merely on the z-leading coefficients
of our quantities. By making the corresponding z-degrees sufficiently large
we evidently only restrict a single term in each quantity, satisfy the above
conditions, and yet permit any desired inequalities between x-degrees,
y-degrees of the same quantities. Moreover ( , , )-leading coefficients
other than the (z, , )-leading coefficients may be taken to have
any desired sign, and the evenness or oddness of ( , , )-degrees, etc.,
other than those already given above are still at our choice. We therefore may
continue with
(28) o of even y-degree, odd (y, x)-degree; 

(29) (y-degree of €, odd) > (y-degree of €s); 
(30) (y-degree of v2) > (y-degree of yee); 
(31) (y-degree of v3) > (y-degree of yi); 
(32) o of odd x-degree. 

Let the x-leading coefficient of ys be m, that of yey, be m2 such that 
(33) re for any d of F(y, z). 


This restriction may be satisfied by Lemma 2 and there merely restricts the 
x-leading coefficients of ys. and yzys. Also take 


(34) (x-degree of ys) = (x-degree of yas) > (x-degree of yzys), 


that is, the x-degree of y, greater than the x-degree of ys, and, if we desire, 
the x-leading coefficient of yz unity, that of 7, y, that of 6, z, and (33) is 
satisfied. 




Finally let 
(35) (vv — — a, 
(36) (vy? — veo) — es |, 
(37) = €1€, Ys = €5€. 
Then 
— = eve? — 


= ee? ly? (v3? — veo) — veo] — (y3? — veo) + 27 


v2 — = el (vss)? — (vee1)?o}. 


ese? — op 


(v? = eyerero + eye oes” eye cer (vy? vic) 


By (38) we have 


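THEOREM 4. If $\rho$, $\epsilon$, $\gamma_1$, $\gamma_5$ are chosen as in (35), (36), (37), the
corresponding algebra A satisfies


(39) $\gamma_5^2 - \gamma_6^2\sigma\rho = (\gamma_1^2 - \gamma_2^2\rho)(\gamma_3^2 - \gamma_4^2\sigma)$


and is associative.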


7. Elementary properties. In (25) we chose the z-degree of y; to be greater 
than the z-degree of y.0. In (26) we took the z-degree of 2 greater than that 
of y«o. It now follows that the only term of e containing its highest power of 
zis (yzys)?. Similarly, by (24), (25) the term of [2 (y? —y2c) — e? ] containing 
its highest power of z is — ¢,?. Hence the term of p containing its highest power 
of Zz is (yovs6s)?. 

LEMMA 4. The z-degree of $\rho$ is positive, even, and the z-leading coefficient of $\rho$
is the negative of a perfect square.

Consider the y-degree of p. By (31) the y-degree of y? —y#a is positive and 
its y-leading coefficient is a perfect square (in y;7). By (35) the leading y-term 
of e is then in (yzys)?, while the leading y-term of €? (y? —y2c)—e? is then 
in (€vys)*. Hence the term of p containing its highest power of y is (eryzy?)?. 

LEMMA 5. The y-degree of $\rho$ is positive and even, and its y-leading coefficient
is a perfect square.


and 
(38) 
Also 
1° = 




Consider the x-degree of e. We have taken the x-degree of ys equal to the 
x-degree of yxys and the x-degree of , greater than the x-degree of y;. But 
e= —[(yxvs)? ]o+ (yevs)*. Hence the x-leading coefficient of e is the prod- 
uct of the x-leading coefficient of —o by 7? +7. But the x-degree of o has 
been taken odd. 


LEMMA 6. Let $\sigma_0$ be the x-leading coefficient of $\sigma$. Then the x-leading coefficient
of $\epsilon$ is $-\sigma_0(\pi_1^2 + \pi_2^2)$ and the x-degree of $\epsilon$ is a positive odd integer.


The quantity $\gamma_1^2 - \gamma_2^2\rho$ is determined by (38). We shall require
LEMMA 7. The z-degrees of $\gamma_1^2 - \gamma_2^2\rho$ are all even.


For proof we notice that we have already shown that the z-degree of e is 
even, in fact the leading term of e when arranged according to powers of zis a 
perfect square. Also we have taken the z-degree of (y2¢;)? greater than that 
of (ye:)*a. Hence the z-degree of y? —y?p is even. In fact its z-leading coeffi- 
cient occurs only in (y# esys)? and is a perfect square, so that all its z-degrees 
are even. 

One of the properties required in our definition of A is that neither $\rho$, $\sigma$,
nor $\sigma\rho$ shall be the square of any quantities of K. We shall prove


LEMMA 8. Neither $\rho$, $\sigma$, nor $\sigma\rho$ is the square of any quantity of K(q).


For let $\rho = a^2$ where a is in K(q). Then $\mu a = \lambda$ where $\lambda$ is in J[q] and $\mu$ is
in J. Then $\rho\mu^2 = \lambda^2$ in J. A quantity $\lambda$ of K(q) has its square in K if and only
if it is either in K or a multiple of q by a quantity of K. If $\lambda$ in J[q] is in K
then $\lambda$ is in J so that $\rho\mu^2 = \lambda^2$ is impossible because the (z, y, x)-leading coefficient
of $\rho$ and hence $\rho\mu^2$ is negative while that of $\lambda^2$ is positive. Hence $\lambda = \nu q$
with $\nu$ in J. Then $\lambda^2 = \nu^2\delta$ is an S-polynomial and cannot be identical with
$\rho\mu^2$ of negative (z, y, x)-leading coefficient.

Similarly $\sigma \neq a^2$ where we now use the property that $\sigma$ has odd x-degree.
Finally by (28) and Lemma 5 $\sigma\rho$ has odd (y, x)-degree and $\sigma\rho \neq a^2$ for any a
of K(q).


Corollary 1. The quantities p, σ, σp are not the squares of any quantities of
K.


It follows from Corollary 1 that K(u, v) is a quartic field over K and that 
gi=0 if and only if y:=y2=0. By Lemma 7, g:+0. Also (31) implies that 
g20, while the associativity condition (38) implies that g;~0. 

8. The exponent of A. We shall use (7) to prove that A has exponent 
four, that is, A is not a direct product of two algebras of degree two. Assume 
that A has not exponent four so that (7) is satisfied for a1, a2, a3 in K and not 
all zero. As we have already remarked we may take a1, a2, a3 in J. If a2 = a3 = 0,




(7) a? — ava = — 


implies that a1² = a1 = 0, a contradiction. Hence if a3 = 0 then a2 ≠ 0 and
σ = (a1 a2^{-1})², a contradiction of Corollary 1. Thus a3 ≠ 0.

By Lemma 7 y? so that (y2es)?— (yee1)? #0. The equation 
vy? —y?p=he gives 

(a? — = (ash)*e. 

Let =ash Bi Then, as may be easily 
computed,* 
(40) BY — = eB? (8; ¥ 0, Bi, Bo, Bs in J). 


But then 8? =08?+e8?. The x-leading coefficient of e8? has the form 
—o +72 by Lemma 6. The x-leading coefficient of has the form 
oo6e?. But (72+72)6;¢ #0 is not the square of any quantity of K(y, 2). 
Hence the x-leading coefficient of o8.? +e8? is not zero. But the x-degree of 
this expression is odd since o has odd x-degree, e has odd x-degree, 6;~0. It 
follows that (40) is impossible for 8; 0, a contradiction. 

9. The first norm condition. We wish to prove that algebra B is a division 
algebra, that is, prove that g:~a-a(—v) for any a of K(u, v), the so called 
first norm condition. As we have shown this condition will be satisfied if we 
can show that the equation 


(41) Si + Sop — o(S3 + Sip) = ¥i5s 


is impossible for S-polynomials S;, - - - , Ss not all zero, a consequence of §5 
applied to (9). 

By Lemma 2 the y-degree of p is even and the (y, z, «)-leading coefficient 
of p is positive. Also the y-degree of o is even. Hence the y-degree of each of 
Si, Sop, Sz, Sup is even. But the (y, 2, x)-leading coefficients of these terms are 
all positive. Moreover S:+S2p, Sip have even (y, z)-degree, while o has 
odd (y, z)-degree. Hence the (y, z)-degree of Si+S.p — o(S;+S,) is either 
even or odd according as the (y, z)-degree of S:+Szp is greater or less than the 
(y, z)-degree of (S;+.S.p)o. In any case the corresponding (y, z, x)-leading coef- 
ficient is zero if and only if S;=S:;=5S;=S,=0. We have shown that 
T =S,+S2p—o(S3+Sip) has even y-degree and (y, z, x)-leading coefficient 
zero if and only if S;=0 (¢=1, - - - ,4). 

By (35), (30), (31) the y-degree of e is even. By (37), (29) the y-degree of 
7 is odd. Hence the y-degree of 1S; is odd unless $;=0. But y:S;=T has 
even y-degree. Hence S;=0, T=0, T has (y, z, x)-leading coefficient zero so 
that S;=0 (¢=1,--+-,5). 

* That is, let b= y265+ Then + 
and a-a(—v)-b-b(—v) =(a? —a?c)-h=ab+ab(—v) =B? —B?o. 




10. The second norm condition. This is the condition ge = X’X which, by 
§5 and (19), is satisfied if we can prove that 


(42) 75 [S; + Sx — + + vv? p) [Ss+Seop — pS7— oS] = 37559 


is impossible for S-polynomials S;(i=1, - - - , 9) not all zero. Notice that we 
have replaced pa;*p = (pas)? of (19) by the S-polynomial Sx instead of the 
formally corresponding p*S3. 

By (24) the z-degree of 3 is odd. By the proof of Lemma 4 the z-degree of 
e is even and the z-leading coefficient of ¢ is a perfect square. Applying (27) 
we have 


Lemma 9. The z- and (z, y)-degrees of ys are odd.


We have taken p to have all even degrees and negative (z, y, x)-leading 
coefficient by Lemma 4. Also o has even z-degree, (z, y)-degree, but odd 
(z, y, x)-degree. Hence the (z, y, x)-leading coefficient of any S;—pS; is posi- 
tive or zero according as not both or both of S;, S; are zero. Hence the 
(z, y, x)-leading coefficient of a combination T=S;—pS;+o(S,—pS;) is zero 
if and only if the four S; are zero. Moreover T has even (z, y)-degree and 
(z, y)-leading coefficient which is identically zero only when all the four S; 
are zero. But the (z, y)-degree of sz is even, the (z, y)-degree of y2 —72p is 
even, while that of y; is odd. Hence the (z, y)-leading coefficient of 


R = y5[(Si — + o(S2 — pSs)| + va(v2 — [Ss — pS; — o(Ss — pSs)| 


is either the (z, y)-leading coefficient of its first bracket or of its second 
bracket, while R has z-leading coefficient identically zero if and only if S;=0 
(i=1,---, 8). But the z-degree of R is odd unless the S; are zero since the 
z-degree of 3 is odd by (25), that of ys odd by Lemma 9. By (42) R=ysysSo 
has even z-degree. Hence R=0, S,=0, and R has z-leading coefficient zero. 
This proves that S;=0 (t=1, - - - ,9) as desired. We have proved 


Lemma 10. Let F be a real number field, x, y, 2 indeterminates, and let A be 
an algebra of order sixteen over K =F(x, y, 2) defined by (1)-(5), (23)-(37). 
Then A is a normal division algebra of degree and exponent four over K, AXL 
is a normal division algebra of degree four over L for every quadratic field 
L=K(q), +67 (61, in K). 


As an immediate corollary of Lemma 10 we then have 


THEOREM. The algebras of Lemma 10 are non-cyclic algebras of degree four 
not expressible as direct products of cyclic algebras of degree two. 


UNIVERSITY OF CHICAGO,
CHICAGO, ILL.


ON ABELIAN FIELDS* 


BY 
LEONARD CARLITZ†


1. INTRODUCTION 


By Kronecker’s‡ Theorem on Abelian fields, all such fields are subfields
of cyclotomic fields, that is, fields generated by a root of unity. Abelian
fields may then be classified by considering all cyclotomic fields and sorting
the subfields in some manner that will exclude repetition. For example, this
is done, in part at least, by Weber by making use of the notion of primary
subfields: a subfield of $\Omega_m$, the field generated by a primitive mth root of
unity, is a primary subfield if it is not contained in an $\Omega_{m'}$ ($m' < m$). We here
make use of what we shall call simple§ (primary) subfields as defined below.
If then the (known) discriminants of Abelian fields are set up on this basis, a 
number of properties of Abelian fields become apparent. In particular is this 
true of the fields contained in a fixed simple subfield (see §5). 

In §6 some results on common index divisors (that is, common inessential 
discriminantal divisors) are obtained. Using a necessary and sufficient con- 
dition valid for any algebraic field it is shown how to derive for the case of 
Abelian fields very simple criteria that a given rational prime be a common 
index divisor. The criteria are of two kinds. A typical instance of the first 
kind is the following. 

Let q and l be odd primes such that l ≡ 1 (mod q); let C denote that
cyclic subfield of $\Omega_l$ that is of degree q. Then a necessary and sufficient con-
dition that a prime p (p < q) be a common index divisor of C is that


$p^{(l-1)/q} \equiv 1 \pmod{l}$.


As an instance of the criteria of the second kind, we quote the following 
theorem: 


Let K be Abelian of degree $q^n$ and type (1, 1, ...). Then if d is the discrimi-
nant of K, and if


* Presented to the Society, November 29, 1930; received by the editors March 11, 1932. 

† International Research Fellow.

‡ See Hilbert, Die Theorie der algebraischen Zahlkörper, Jahresbericht der Deutschen Mathe-
matiker Vereinigung, vol. 4 (1894-1895), Theorem 131.

§ Called “Ausgangs-Kreiskörper” by M. Gut, Die Zetafunktion, die Klassenzahl und die Kro-
necker’sche Grenzformel eines beliebigen Kreiskörpers, Commentarii Mathematici Helvetici, vol. 1
(1929), p. 160.




(i) p does not divide d, $p \le q^{n/q}$;
or (ii) p | d,* $p \le q^{(n-1)/q}$,
p is surely a common index divisor of K.


We shall suppose in what follows that all the discriminants are odd unless 
the contrary is explicitly stated; this makes for a considerable simplification 
and avoids listing a great many exceptional cases. 

2. CLASSIFICATION 


Let m be an integer ≥ 3, and let $\Omega_m$ be the field defined by a primitive
mth root of unity. We suppose the group $G_\phi$ of $\Omega_m$ exhibited by a reduced
residue system (mod m); $\phi$ stands for $\phi(m)$. Let m be divisible by exactly n
(odd) primes:


$m = q_1^{f_1} \cdots q_n^{f_n}$;
put


$\phi_i = \phi(q_i^{f_i}) = q_i^{f_i - 1}(q_i - 1)$ $(i = 1, \dots, n)$.


Let $r_i'$ denote fixed primitive roots (mod $q_i^{f_i}$), respectively, and let $r_i$ be
defined (mod m) by
(1) $r_i \equiv r_i' \pmod{q_i^{f_i}}$,

$r_i \equiv 1 \pmod{q_j^{f_j}}$ $(i \neq j)$.
Then $G_\phi$ is generated by $r_1, \dots, r_n$:


(2) $G_\phi = \{r_1, \dots, r_n\}$.
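
The construction of the generators in (1) can be sketched in code. The following is only an illustrative sketch: the function names `primitive_root` and `generators` are hypothetical, the smallest primitive root is chosen for each prime power (the text fixes an arbitrary one), and the congruences in (1) are assumed to read "primitive root modulo the i-th prime power, 1 modulo the others".

```python
from math import gcd


def primitive_root(q, e):
    """Smallest primitive root modulo q**e, for an odd prime q (hypothetical helper)."""
    modulus = q ** e
    order = modulus - modulus // q  # Euler phi of q**e
    # Collect the distinct prime factors of the group order.
    factors, n, d = set(), order, 2
    while d * d <= n:
        while n % d == 0:
            factors.add(d)
            n //= d
        d += 1
    if n > 1:
        factors.add(n)
    for r in range(2, modulus):
        if gcd(r, modulus) == 1 and all(
            pow(r, order // t, modulus) != 1 for t in factors
        ):
            return r
    raise ValueError("no primitive root found")


def generators(prime_powers):
    """Given [(q_1, f_1), ...], return m and the list of generators r_i of (1)."""
    m = 1
    for q, f in prime_powers:
        m *= q ** f
    gens = []
    for q, f in prime_powers:
        qf = q ** f
        rest = m // qf  # product of the other prime powers
        root = primitive_root(q, f)
        # Solve r = 1 + rest * t with r congruent to root modulo q**f.
        # pow(x, -1, n) is the modular inverse (Python 3.8 or later).
        t = ((root - 1) * pow(rest, -1, qf)) % qf
        gens.append((1 + rest * t) % m)
    return m, gens
```

For m = 5 · 7 this yields the generators 22 and 31, whose powers together run through all 24 reduced residues (mod 35).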


We now define a simple (primary) subfield of $\Omega_m$ as one corresponding‡
to a group


(3) $G_\mu = \{r_1^{\nu_1}, \dots, r_n^{\nu_n}\}$,


where
(4) $\phi_i = \mu_i \nu_i$, $q_i$ does not divide $\mu_i$ $(i = 1, \dots, n)$.


$G_\mu$ is evidently of order $\mu = \mu_1 \cdots \mu_n$, and K, the field corresponding to $G_\mu$, is
of degree $\nu = \nu_1 \cdots \nu_n$. That K is indeed primary follows from the second part
of equations (4).

It is now a simple matter to exhibit our mode of classification. We notice
to begin with that any primary subfield k of $\Omega_m$ is contained, properly or


* As usual, read for a| b, “a divides b.” 
† Hilbert, loc. cit., p. 248.
‡ Hilbert, loc. cit., p. 250.




improperly, in a unique minimal simple subfield of $\Omega_m$; for it is clear from (2)
that the greatest common subfield of two simple subfields is itself a simple
subfield of $\Omega_m$. Let us then fix our attention on a particular simple subfield K.
Choose any maximal subfield $k_1$ of K. If $k_1$ is primary (with respect to $\Omega_m$),
choose $k_2$, some maximal subfield of $k_1$; we continue this process until we ar-
rive either at another simple subfield or else at a $k_i$ none of whose subfields is
primary. To illustrate the process, let us classify the primary subfields of
$\Omega_m$, $m = 5^2 \cdot 11^2$.

Let $r_1$, $r_2$ appertain to $5 \cdot 4$, $11 \cdot 10$, respectively (see (1) above). Then, if
the group generated by A, B, - - - be denoted by {A, B, - - - }, we get among 
others the following chains: 

I. {1}~Q.,, 


{rire } ~ k, of degree ¢(m)/2, 
ry } ~ ke (simple) of degree ¢(m)/4. 


II. Starting with kz we may choose one of 


50 55 10 111 


{ry } ~ ko; of degree ¢(m)/20 


fr, re } ~ k; (simple) of degree ¢(m)/8; etc. 


III. Starting with k;, we may choose one of 
re , re} ~ of degree ¢(m)/40 


no subfield of any 3; is primary. 
IV. In place of f; (of I) we may take 


20 22% 


{rite ki; of degree ¢(m)/5 
| ~ kd? (simple) of degree ¢(m)/25, 


and contained in each f;/. 

These four chains will suffice to indicate how the classification may be 
carried out in any special case; the utility of this method of arrangement will 
appear below. 

3. THE DISCRIMINANTS 

The form of the discriminant of an Abelian field is known, at least in the
sense that the discriminant of any subfield of an $\Omega_m$ can be explicitly written
down.* As the explicit expression for the discriminants will be required they
will be stated here in the form of lemmas. It is convenient, and indeed leads 
to an important result, first to calculate the discriminant of an arbitrary 
simple field, and then proceed to the case of an entirely arbitrary subfield. 


Lemma 1. The discriminant of a simple primary subfield K of $\Omega_m$ is deter-
mined by


(5) d(K) 
where 
t; 


(6) si = — 1) — 1), 
and yi, vi, bi, are defined by (4). 


If now k is any primary subfield of $\Omega_m$, then as seen in §2 it either is itself
simple or else is contained in a unique minimal simple subfield. Calling this 
field K, and assuming all the above notation for a simple field, we get 


Lemma 2. The relative discriminant of K with respect to k is the unit ideal 
of k. 
Now by a general theorem†
$d(K) = d^{p}(k)\,N(D)$,
where p is the relative degree of K/k, D is the relative discriminant of K/k,
and N(D) denotes the norm in k. Hence Lemmas 1 and 2 immediately imply


Lemma 3. The discriminant of an arbitrary primary k is determined by 


d(k) = = + [J 
where t; is defined by (6). 


4. THE SUBFIELDS OF A SIMPLE SUBFIELD


Let us fix some K, a simple subfield of $\Omega_m$, defined by equations (3) and
(4), say. We shall consider the set of fields {k} satisfying the following con-
ditions:

(i) k is a primary subfield of $\Omega_m$;


* See, for example, Gut, loc. cit.
† Hilbert, loc. cit., Theorem 39.




(ii) K is the minimal simple field containing k. We shall call {k} the set
of fields belonging to K. 

By means of Lemma 3, once we have calculated the discriminant of K, 
we determine at once the discriminant of k, a member of the set of fields 
belonging to K, if we know merely the relative degree of K/k. Furthermore 
if two fields in {k} have the same degree their discriminants must coincide. It is 
not difficult to determine the conditions K must satisfy in order that there 
be several fields of the set of equal degree; however, we shall consider only 
the special case of a K of type (1,1, ---). 

Let K be an Abelian field of degree $q^n$ and type (1, 1, ...), q an odd
prime. From Hilbert’s proof of Kronecker’s Theorem on Abelian fields, we
may deduce that K is a subfield of $\Omega_m$, where


qi =1 (mod g) 


according as g does or does not divide the discriminant of K. Evidently K is 
simple only if the number of distinct primes dividing m is equal to 2, i.e. 


Now if K is not simple, it is readily seen that the simple field to which it
belongs is itself of type (1, 1, ...). Let us then assume K simple, and for the
sake of definiteness let us suppose q | m. Then if the $r_i$ are defined as in (1), K
corresponds to the group ($q_0 = q$)


We can now easily determine the set of fields belonging to K: 
(i) Let us consider first all the cyclic subfields of K; from a well known
result concerning Abelian groups, we see at once that the number of such


fields is
$(q^n - 1)/(q - 1)$.


They may be sorted by considering the number of primes contained in their 
discriminants. There are first of all fields whose discriminants contain but 
a single prime; each corresponds to a subgroup of the type 


q 
Secondly, there are 


n n(n — 1) 
= 


(¢=1,---,d, 




fields whose discriminants contain exactly two primes. They fall into
n(n−1)/2 sets of (q−1) fields, all the fields in a set having the property that
their discriminants are divisible by the same primes. Thus a particular set 
corresponds to 


@ @ 


ao, =1,---,qg—1, aoa, =1 (mod gq). 


fields whose discriminants are divisible by exactly three primes; they fall into
n(n−1)(n−2)/6 sets of (q−1)² fields each, all the fields in a set having the
property that their discriminants contain the same primes. A particular set 
corresponds to 


Thirdly, there are 


{ro, 71,72, 7071 
ao, 41,42 = 1,--+,qg—1, =1 (mod q). 


Finally there are 


nN 
n 


fields whose discriminants contain all m primes; they comprise a single set of 
fields. Each field in the set corresponds to a particular* 


= 1 (mod q). 


It will be convenient for a later application to denote the field corresponding 
to 


The fields k are the only primary (cyclic) subfields of K and hence are 
the only cyclic fields in the set belonging to K. 
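
The sorting of the cyclic subfields by the number of primes in their discriminants can be checked by a small enumeration. The sketch below rests on an assumption not spelled out in the text: each cyclic subfield of degree q is identified with a nonzero exponent vector (a_1, ..., a_n) modulo q, taken up to a common scalar factor, and the primes in its discriminant are taken to be those with a nonzero exponent. The function name `cyclic_subfield_counts` is hypothetical.

```python
from itertools import product
from math import comb


def cyclic_subfield_counts(q, n):
    """counts[j] = number of cyclic subfields whose discriminant has exactly j primes.

    Each subfield is represented by the exponent vector whose first
    nonzero entry is 1, which picks one vector per scalar class.
    """
    counts = [0] * (n + 1)
    for exponents in product(range(q), repeat=n):
        nonzero = [a for a in exponents if a]
        if nonzero and nonzero[0] == 1:
            counts[len(nonzero)] += 1
    return counts


def expected_counts(q, n):
    """Closed form: C(n, j) * (q - 1)**(j - 1) fields with exactly j primes."""
    return [0] + [comb(n, j) * (q - 1) ** (j - 1) for j in range(1, n + 1)]
```

For q = 3 and n = 3 the enumeration gives 3, 6 and 4 fields with one, two and three primes respectively, which add up to (3³ − 1)/(3 − 1) = 13.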

(ii) To determine $A_n^{(s)}$, the number of fields of degree $q^s$ (and of type
(1, 1, ...)) in the set belonging to K, we notice first that the total number of
subfields of K of degree $q^s$ (and necessarily of type (1, 1, ...)) is equal to

* While it may appear from (7) that ro plays a special rdéle, this is by no means the case. Thus it 


is easily verified that Go----¢n-)) contains all numbers of the form r;*irj~*i and therefore any 7; 
might be used in place of ro in defining the group. 




Then by an argument similar to that employed in the special case (i), we see 

that 


(8) ( 


5 


j 


To solve (8) for $A_n^{(s)}$ we may proceed thus:


km 
jms J 


Further transformation of the left member of (9) leads to an unexpected con-
nection between $A_n^{(s)}$ and generalisations of certain important quantities in
finite differences. We make use of the formula (the q-generalisation of the
binomial theorem)


then 


a 


a=0 


qe 


& 
(10) 


AY 
= 1 a 1 n a(a—1)/2_ 
(g — 1)---q@—1) "4 


Let us think for the moment of q as an arbitrary parameter, and write [x]
for the so-called “basic” number


$(q^x - 1)/(q - 1)$
which reduces to x when q = 1. Further, defining [m]! by


$[m]! = [m][m - 1] \cdots [1]$,
(10) becomes


—a(s—1)/2 


which, but for the $(q-1)^{n-s}$, is a q-generalisation of what are sometimes called
Stirling numbers.

Before leaving the $A_n^{(s)}$ we derive one other important formula connecting
them. From (9) 


so that the right member of (12) becomes 


= (q* — 1) 


Therefore, finally
(13) $A_{n+1}^{(s)} = A_n^{(s-1)} + (q^s - 1)\,A_n^{(s)}$.
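
The quantities of this section lend themselves to a numerical check. The sketch below is written under two stated assumptions: that the total number of subfields of degree q^s is the Gaussian binomial coefficient [n]!/([s]![n−s]!), the standard count of subgroups of a group of type (1, 1, ...), and that relation (8) expresses this total as the sum over j of C(n, j) times the number of such fields belonging to a simple field with j primes. Under these assumptions the numbers follow by inclusion and exclusion, and the recurrence A_{n+1}^{(s)} = A_n^{(s−1)} + (q^s − 1) A_n^{(s)} can be verified for small cases. All function names are hypothetical.

```python
from math import comb


def basic(x, q):
    """The "basic" number [x] = (q**x - 1)/(q - 1), for an integer q >= 2."""
    return (q ** x - 1) // (q - 1)


def basic_factorial(m, q):
    """[m]! = [m][m-1]...[1], with the empty product [0]! = 1."""
    out = 1
    for i in range(1, m + 1):
        out *= basic(i, q)
    return out


def gauss_binomial(n, s, q):
    """Gaussian binomial coefficient; zero outside the range 0 <= s <= n."""
    if s < 0 or s > n:
        return 0
    return basic_factorial(n, q) // (
        basic_factorial(s, q) * basic_factorial(n - s, q)
    )


def fields_belonging(n, s, q):
    """Number of fields of degree q**s involving all n primes (inclusion-exclusion)."""
    return sum(
        (-1) ** (n - j) * comb(n, j) * gauss_binomial(j, s, q)
        for j in range(n + 1)
    )
```

For q = 3 and n = 3 this gives 4 fields of degree 3 and 10 of degree 9 involving all three primes, in agreement with the count (q − 1)^{n−1} for the cyclic case.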
5. SIMPLE SUBFIELDS AS RELATIVE ABELIAN FIELDS 


We return to the consideration of the general case defined at the beginning
of §4, that of a simple subfield K and the set of fields {k} belonging to it. By
Lemma 2, the relative discriminant of K with respect to any k of {k} is the
unit ideal of k; further it is clear that K/k is relative Abelian. Let us then for
brevity say that K has the property* A with respect to k.

* K is of course part of the Klassenkörper of each k. For definition and proof of the existence
of the Klassenkörper of an arbitrary algebraic field, see Furtwängler, Mathematische Annalen, 1907,
pp. 1-37; Takagi, Journal of the College of Science, Imperial University of Tokyo, vol. 41 (1920).
As no use of the existence of the Klassenkörper is being made here, it is found convenient to use the
terminology defined above.



(12) 

but 




Let K, $k_1$, $k_2$, ..., where the fields $k_1$, $k_2$, ... are all in the set
belonging to K, be a chain of fields as in §2. Then it is clear that each k has
the property A with respect to any succeeding k of the chain. Conversely, we
shall now prove that if any Abelian field F have the property A with respect
to a k of the chain, then F itself is a member of the chain, and lies somewhere
between K and k (possibly at an end).

By hypothesis the relative discriminant of F/k is the unit ideal of k, so
that the only primes dividing the discriminant of F are those dividing the
discriminant of k and therefore of $\Omega_m$, the cyclotomic field of which k is a
primary subfield. Then F is a primary subfield of an $\Omega_{m'}$, where


(14) m= oes and m’ = qv one CH = fi). 


Let K' be that simple subfield of $\Omega_{m'}$ to which F belongs; clearly K' must
have the property A with respect to k. Let $\mu_i$, $\nu_i$, $\mu$, $\nu$ be the numbers deter-
mining K (see (3) and (4)); $\mu_i'$, $\nu_i'$, $\mu'$, $\nu'$ the corresponding numbers for K'.
Let p be the relative degree of K/k, w the relative degree of K'/k. Now $\nu$, $\nu'$
are the degrees of K and K', respectively, so that

wy 
(15) —- 

p 

By Lemmas 1 and 3, the discriminants of k and K’ are 
and 

respectively, where 


(16) = —(ss— +1), = pi +1), 
$i 


and s;, s/ are defined by (6). If now we use the fact that the relative dis- 
criminant of K’/ is the unit ideal, 


d(K") d”(k), and tf ™ wt;/p. 
Using this last equality, together with (15) and (16), we get 


si —pit+t 


that is, 


i 
= 1) - qi’ 


(i= 


(17) — 1) 
Since yw; and yu! <q it follows from (17), first, that f? =f;, and then im- 
mediately u/ =y;. But this shows that K’ and K are identical. We may now 
state the theorem. 




THEOREM 1. Let K be any simple subfield of $\Omega_m$, and let {k} be the set of
fields belonging to K. Then K has the property A with respect to each k. Con-
versely, any Abelian field that has the property A with respect to some k is nec-
essarily a subfield of K.


Some information about the class number of the fields considered can
be derived from this general theorem:* If F is relative Abelian with respect to G
and the relative discriminant of F/G is the unit ideal of G, then the class number 
of G is divisible by p, the relative degree of F/G. Actually Hilbert proves the 
theorem only in the case p a prime, but as he remarks there is no great diffi- 
culty in extending the result to the general case. Hence we obtain 

THEOREM 2. Let k be any primary subfield of $\Omega_m$, and K the minimal simple
field containing k. Then if p denote the relative degree of K/k, the class number 
of k is a multiple of p. 


6. COMMON INDEX DIVISORS 


A rational prime p is called a common index divisor of an arbitrary
algebraic field F if, for every integer ω of the field,


$d(\omega)/d \equiv 0 \pmod{p}$,


where d is the discriminant of F, and d(ω) that of ω. The following criterion
deduced from a result of Dedekind’s is given by Hensel.†


Let the prime-ideal decomposition of p in F be
(18) $p = \mathfrak{p}_1^{e_1} \cdots \mathfrak{p}_r^{e_r}$, $N(\mathfrak{p}_i) = p^{f_i}$.


Let $\Psi(f)$ denote the number of primary irreducible polynomials (mod p) of
degree f:


(19) $\Psi(f) = \frac{1}{f} \sum_{d \mid f} \mu(d)\, p^{f/d}$.


Then a necessary and sufficient condition that p be a common index divisor of F
is that, for at least one i,


$\Psi(f_i) < g(f_i)$,
g(f) denoting the number of $\mathfrak{p}$'s in (18) of degree f.
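
The criterion just stated is easy to apply mechanically once the residue degrees of the prime ideals are known. The following sketch assumes that (19) is the usual Möbius-sum count of monic irreducible polynomials of degree f modulo p, and that the decomposition is supplied as a plain list of residue degrees, one entry per prime ideal; the names `mobius`, `psi` and `is_common_index_divisor` are hypothetical.

```python
from collections import Counter


def mobius(n):
    """Moebius function, computed by trial division."""
    result, d = 1, 2
    while d * d <= n:
        if n % d == 0:
            n //= d
            if n % d == 0:
                return 0  # a squared prime factor
            result = -result
        d += 1
    if n > 1:
        result = -result
    return result


def psi(p, f):
    """Number of monic irreducible polynomials of degree f modulo the prime p."""
    total = sum(
        mobius(d) * p ** (f // d) for d in range(1, f + 1) if f % d == 0
    )
    return total // f


def is_common_index_divisor(p, residue_degrees):
    """Criterion: some degree f occurs more often than psi(p, f) allows."""
    counts = Counter(residue_degrees)
    return any(psi(p, f) < g for f, g in counts.items())
```

As a usage example, a prime 2 that splits into three ideals of degree 1 in a cubic field gives `is_common_index_divisor(2, [1, 1, 1])`, which holds because only two linear polynomials exist modulo 2.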


* Hilbert, loc. cit., Theorem 94.
† Bachmann, Zahlentheorie V: Allgemeine Arithmetik der Zahlenkörper, 1926, p. 276.




To apply this something must be known about the decomposition of 
primes in the field to be considered. For an Abelian field this information is 
given by another theorem of Dedekind’s.* 


DECOMPOSITION RULE.† Let $\Omega_m$ be a cyclotomic field and F any subfield.
Let the group of $\Omega_m$ be represented by a reduced residue system (mod m) and let
(h) denote the subgroup corresponding to F. Let $p^a$ be the highest power of the
prime p dividing m, $m = p^a m'$; and let the number of those numbers of (h) that
are ≡ 1 (mod m') be $\phi(p^a)/e$, thus defining e. Let f be the smallest positive integer
such that


(20) $p^f \equiv (h) \pmod{m'}$,


that is, to one of the numbers in (h). Then the prime-ideal decomposition of p
in F is
$p = (\mathfrak{p}_1 \cdots \mathfrak{p}_g)^e$, $N(\mathfrak{p}_i) = p^f$,
where e·f·g is the degree of F.
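
The Decomposition Rule translates directly into a short computation. The sketch below assumes that the count of elements of (h) congruent to 1 (mod m') defines the ramification exponent e, that (h) is passed as an explicit list of residues between 1 and m, and that the degree of F is φ(m) divided by the order of (h). The function name `decomposition` is hypothetical.

```python
from math import gcd


def decomposition(m, subgroup, p):
    """Return (e, f, g) for the prime p in the subfield of the m-th cyclotomic
    field that corresponds to `subgroup`, a subgroup of the reduced residues mod m."""
    units = [x for x in range(1, m + 1) if gcd(x, m) == 1]
    degree = len(units) // len(subgroup)

    # Split m = p**a * m_prime with p not dividing m_prime.
    m_prime, p_power = m, 1
    while m_prime % p == 0:
        m_prime //= p
        p_power *= p
    phi_p_power = p_power - p_power // p if p_power > 1 else 1

    # Elements of the subgroup congruent to 1 (mod m_prime): phi(p**a) / e of them.
    count = sum(1 for h in subgroup if (h - 1) % m_prime == 0)
    e = phi_p_power // count

    # Smallest f with p**f congruent (mod m_prime) to a member of the subgroup.
    residues = {h % m_prime for h in subgroup}
    f = 1
    while pow(p, f, m_prime) not in residues:
        f += 1

    g = degree // (e * f)
    return e, f, g
```

For the cubic subfield of the 7th cyclotomic field, with subgroup {1, 6}, the prime 2 gives (e, f, g) = (1, 3, 1) and the prime 7 gives (3, 1, 1); for the cubic subfield of the 31st cyclotomic field the prime 2 gives (1, 1, 3).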
We take first the simplest and perhaps the most interesting case, that of a
cyclic field C of odd prime degree q and of discriminant divisible by a single
prime. Then the discriminant is, by Kronecker’s Theorem and Lemma 1,
either


(a) $q^{2(q-1)}$; or (b) $l^{q-1}$,


where l is a prime such that l ≡ 1 (mod q). By the Decomposition Rule (or
directly, using well known theorems on the decomposition of a prime in a
Galois field) the condition that a prime p factor in C is either

(21) (a) $p^{q-1} \equiv 1 \pmod{q^2}$ $(p \neq q)$;
or
(b) $p^{(l-1)/q} \equiv 1 \pmod{l}$ $(p \neq l)$.
Now if p factor in C it factors into q distinct prime ideals (of the first degree).
Hence, applying the criterion for common index divisors, and noticing that
$\Psi(1) = p$, we deduce one of the theorems stated in the Introduction.


THEOREM 3. Let C be a cyclic field of prime degree q and of discriminant 
divisible by a single prime. Then a necessary and sufficient condition that a prime 
p(p <q) be a common index divisor of C is furnished by equations (21). 
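
Theorem 3 can be tried out numerically in case (b), where the discriminant is a power of a prime l ≡ 1 (mod q). The sketch assumes that condition (21)(b) is the power-residue congruence p^{(l−1)/q} ≡ 1 (mod l), which is the standard condition for p to split in the subfield of degree q; the function names are hypothetical.

```python
def factors_in_C(p, q, l):
    """Condition (21)(b), as assumed here: p is a q-th power residue modulo l."""
    if (l - 1) % q != 0:
        raise ValueError("l must be congruent to 1 modulo q")
    if p == l:
        raise ValueError("p must differ from l")
    return pow(p, (l - 1) // q, l) == 1


def is_common_index_divisor(p, q, l):
    """Theorem 3, case (b): p < q and p factors in C."""
    return p < q and factors_in_C(p, q, l)
```

For cubic fields (q = 3) the only candidate is p = 2: it is a common index divisor when l = 31 or l = 43, but not when l = 7 or l = 13.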


* Gesammelte Werke, vol. 1, p. 233, 

† This theorem is implicitly proved by Gut, loc. cit., §§5 and 8.

‡ For an equivalent criterion for cubic fields see Hensel, Journal für Mathematik, vol. 113
(1894), p. 147.




The necessity of the condition follows from the theorem* that a common 
index divisor of any field is less than the degree of the field. 

Turning now to the case of the general cyclic field C of odd prime degree, 
we remark first that its discriminant, d, is either 


(a) or (b) gi=l (mod q). 
Using the notation of §4 (i), let C correspond to the group 


To determine the condition that a prime p(p does not divide d) factor in C 
we use the Decomposition Rule. We need consider but one case in detail; let 
us take case (a). It is plain that m=q’q; - - - gn, and clearly the condition that 
p factor is that f in (20) be one; or putting 


(22a) (mod m) (1 < 


p factors provided that integers s, ¢ can be found such that 


Qn, a4, —a an ty 


t 


But this congruence is equivalent to the system 
Co = + aiti + - + (mod $(9?)), 
Ci = — Aol (mod =1,---,™m), 
which is equivalent to 
(23a) + anc, = 0 (mod gq), 
the condition sought. 


THEOREM 4. A necessary and sufficient condition that p (p <q) be a common 
index divisor of C = C\%»'*»%) is furnished by (22a) and (23a). Similarly a 
necessary and sufficient condition that p(p <q) be a common index divisor of 

C= C an) 
is furnished by 
(22b) P= 
and 


(23b) ae, +--++ a,c, =0 


* Proved by von Zylinski, Mathematische Annalen, vol. 73 (1913), p. 273. 


(mod q).


Turning next to the simple field K defined by (3) and (4), the Decomposi-
tion Rule shows that if p does not divide d(K),


P = Pe N(pi) =f, ef =v; 


and 


, w = L.C.M.(,--- 


If then $\Psi(f)$, the number of primary irreducible polynomials (mod p) of degree
f, is less than e, p is a common index divisor of K. But evidently


P*; 


and 


If then wp* <r, surely ¥(f) <e. Hence we have 


THEOREM 5. Let K be the simple field of degree v defined by (3) and (4). If 
p does not divide d(K), and 


(24) wp? <v, w = L.C.M.(n1, 


then p is surely a common index divisor of K. The inequality (24) may be replaced 
by the weaker condition 


(24)’ Max ¥(f) < ». 


Theorem 5 could without much difficulty be refined in several directions. 
And it would also be possible to frame a great many theorems analogous to 
Theorem 4 for various kinds of Abelian fields. However we shall limit our- 
selves to the case of fields of type (1, 1, - - - ). Assume first that the prime p 
does not divide the discriminant of the field. Then by the Decomposition Rule 
or directly it may easily be shown that either 


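(i) $p = \mathfrak{p}_1 \cdots \mathfrak{p}_{q^n}$, each $\mathfrak{p}$ of degree 1;
or
(ii) $p = \mathfrak{p}_1 \cdots \mathfrak{p}_{q^{n-1}}$, each $\mathfrak{p}$ of degree q;


the field being of degree $q^n$. If p divides the discriminant, we get, in place of
(i) and (ii),


(iii) $p = (\mathfrak{p}_1 \cdots \mathfrak{p}_{q^{n-1}})^q$, each $\mathfrak{p}$ of degree 1;


or,


(iv) $p = (\mathfrak{p}_1 \cdots \mathfrak{p}_{q^{n-2}})^q$, each $\mathfrak{p}$ of degree q.


Now


$\Psi(1) = p$, and $\Psi(q) = (p^q - p)/q$.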


Application of the Hensel criterion leads to
THEOREM 6. Let K be of degree $q^n$ and type (1, 1, ...). If
(i) p does not divide d(K), $p \le q^{n/q}$;
or
(ii) p | d(K), $p \le q^{(n-1)/q}$,
p is surely a common index divisor of K.


It is perhaps worth remarking that in Theorem 6 either g or d(K) may be 
even. 
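
The reasoning behind Theorem 6 can be sketched without committing to a closed-form bound: a prime is "surely" a common index divisor when the Hensel criterion is met in both of the decompositions open to it. The sketch assumes the four decomposition shapes for a field of degree q^n and type (1, 1, ...) are q^n ideals of degree 1 or q^{n−1} of degree q when p does not divide d(K), and q^{n−1} of degree 1 or q^{n−2} of degree q when it does. The function name is hypothetical.

```python
def surely_common_index_divisor(p, q, n, divides_discriminant):
    """True when the Hensel criterion holds in both possible decompositions of p.

    q is the prime with K of degree q**n and type (1, 1, ...);
    divides_discriminant says whether p divides d(K).
    """
    k = n - 1 if divides_discriminant else n
    if k < 1:
        return False  # a single prime ideal can never exceed psi(1) = p
    psi_1 = p  # linear polynomials modulo p
    psi_q = (p ** q - p) // q  # irreducible polynomials of prime degree q
    split_case = psi_1 < q ** k  # q**k ideals of degree 1
    inert_case = psi_q < q ** (k - 1)  # q**(k-1) ideals of degree q
    return split_case and inert_case
```

With q = 2 and n = 6, for instance, the primes 3 and 5 qualify even when they divide the discriminant, and 7 qualifies when it does not, while 11 fails the test.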

Theorem 6 evidently implies that, if q be fixed, then, for sufficiently large
n, an assigned prime p will be a common index divisor in any field of type
(1, 1, ... to n units). Thus for example the primes 2, 3, 5, 7 are common
index divisors of


k((— 3)¥2, 51/2, (— 1312, 17/2, (— 19)12), 


We consider finally a refined form of Theorem 6 for the case in which
the (odd) discriminant is divisible by exactly n primes. The field is then sim-
ple. To determine the decomposition of rational primes in such a field we
could of course apply once more the decomposition rule. It is however some-
what simpler and perhaps more interesting to proceed differently. The field
K under consideration is, by Kronecker’s Theorem, composed of the cyclic
fields $C(q_i)$, each of degree q and of discriminant a power of $q_i$. Here $q_i$ is either
a prime ≡ 1 (mod q); or, if q | d(K), one of them is $q^2$. From Theorem 3 we
already know when a prime p ($p \neq q_i$) will factor in $C(q_i)$; as for $q_i$ we have of
course (in $C(q_i)$) either

gi = 9, 
or 
q = for gi = q?. 
Now p may decompose in K in one of four possible ways (see the proof of the 
preceding theorem). It is now fairly clear that if p does not divide d(K), and 


(25) = 1 (modg;) for i=1,---,n, 
then 




pe”, N(pi) = 2; 
if (25) fails for at least one 7, then 
P= N(pi) = 

_ if p|d(K), and 
(25)' 
for all i such that p does not divide q;, then 

= (pi- N(pi) = 2B; 
but if (25)’ fail for at least one 7, then 

We are now able to apply the Hensel criterion and we have at once 


THEOREM 7. Let K be of degree $q^n$ and type (1, 1, ...); and let d(K) be
divisible by exactly n primes. Let p be a prime < $q^n$; then if p does not divide
d(K), and

(i) if (25) hold, p is a common index divisor;

(ii) if (25) fails for at least one i, then p is a common index divisor only if 


if p |d(K), and 
(iii) if (25)’ hold, p is a common index divisor if 


(iv) if (25)’ fails for at least one i, then p is a common index divisor only if 
pee. 


CAMBRIDGE UNIVERSITY, 
CAMBRIDGE, ENGLAND 




ON THE DERIVATIVES OF NEWTONIAN AND 
LOGARITHMIC POTENTIALS NEAR THE 
ACTING MASSES* 


BY 
MILDRED M. SULLIVAN 


1. Introduction. The existing theorems† on the continuity of the deriva-
tives of the potentials of various spreads of acting matter, at points of the
spreads, exact more of the densities and surfaces involved than is necessary
for the conclusions drawn. A single exception is the work of Petrini,‡ who
considers necessary and sufficient conditions. The generality of his results, 
and the incident delicacy of the considerations establishing them, have pre- 
vented their becoming widely current. Moreover, they are concerned only 
with the derivatives of the first two orders in the case of potentials of volume 
distributions, and of the first order in the case of surface spreads. 

Recently,§ Professor Kellogg, in a study of the continuity of harmonic 
functions defined by their boundary values, and of their derivatives, has 
shown the usefulness of a simple condition due to Dini.|| The present paper 
consists in a systematic application of the Dini condition to harmonic func- 
tions defined by the densities of the spreads whose potentials they are. Be- 
cause of the immediate availability of the same methods of proof, theorems 
have been added on the effect of Hölder conditions on the densities and their
derivatives in assuring the existence of Hélder conditions on the potentials 
and their derivatives, results which are already at hand only for the deriva- 
tives of the first order, in the work of Schauder (loc. cit.). The treatment is 


* Presented to the Society, October 31, 1931; received by the editors April 20, 1932. 

The author desires to make to Professor O. D. Kellogg grateful acknowledgment for the sug- 
gestion which led to this paper, and for his helpful interest in its development. 

† See, for the earlier literature, L. Lichtenstein, Neuere Entwickelung der Potentialtheorie. Kon-
forme Abbildung, Encyklopädie der mathematischen Wissenschaften, II C 3, pp. 199-209. See also
J. Schauder, Potentialtheoretische Untersuchungen, Erste Abhandlung, Mathematische Zeitschrift,
vol. 33 (1931), pp. 602-640; L. Lichtenstein, Über einige Hilfssätze der Potentialtheorie, IV, Sächsische
Berichte, vol. 82 (1930), pp. 265-344.

‡ Les dérivées premières et secondes du potentiel, Acta Mathematica, vol. 31 (1908), pp. 127-332.

§ On the derivatives of harmonic functions on the boundary, these Transactions, vol. 33 (1931),
pp. 486-510.

|| Sur la méthode des approximations successives pour les équations aux dérivées partielles du
deuxième ordre, Acta Mathematica, vol. 25 (1902), p. 224. The form of the condition used in the pres-
ent paper is slightly more general than that employed by Professor Kellogg.




given for space of three dimensions, but it is shown that the results are valid 
also for logarithmic potentials.* 

2. Definitions and lemmas. We introduce the following 

Definition. Let f(q) be defined on the regular surface element S,† and be
continuous at the point p of S. Let q vary on S in any fixed plane through the
normal to S at p. Then f(q) is said to satisfy a uniform Dini condition at p if
the integral 


(1) - IOI 55,3 


r 


converges uniformly as to the normal plane chosen. f(q) is said to satisfy a uni- 
form Dini condition on S if it is continuous on S and the convergence of the 
integral (1) is uniform both as to the point p and as to the direction at p. 
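To get a feeling for where the Dini condition sits, the following minimal sketch evaluates the integral (1) numerically for two model moduli of continuity. The names `dini_tail`, `log_modulus` and `log_squared_modulus`, the choice a = 1/2, and the moduli themselves are illustrative assumptions, not taken from the paper. The modulus 1/|log r| is continuous at r = 0 but its Dini integral grows without bound as the lower limit shrinks, while 1/log² r has a convergent Dini integral although it satisfies no Hölder condition.

```python
import math


def dini_tail(modulus, eps, a, steps=20_000):
    """Midpoint-rule value of the integral of modulus(r) / r over [eps, a].

    The substitution u = -log r turns dr / r into du, so the quadrature
    stays well conditioned even when eps is extremely small.
    """
    u_lo, u_hi = -math.log(a), -math.log(eps)
    h = (u_hi - u_lo) / steps
    return h * sum(modulus(math.exp(-(u_lo + (k + 0.5) * h))) for k in range(steps))


def log_modulus(r):
    """Illustrative modulus 1 / |log r|: tends to 0 with r, but is not Dini."""
    return 1.0 / abs(math.log(r))


def log_squared_modulus(r):
    """Illustrative modulus 1 / log(r)**2: Dini, but not Hölder for any exponent."""
    return 1.0 / math.log(r) ** 2


if __name__ == "__main__":
    a = 0.5
    for eps in (1e-4, 1e-8, 1e-16, 1e-32):
        # First column keeps growing (like log|log eps|); second column settles.
        print(eps, dini_tail(log_modulus, eps, a), dini_tail(log_squared_modulus, eps, a))
```

With u = −log r the two integrals have the closed forms log(u_hi/u_lo) and 1/u_lo − 1/u_hi, which is what the quadrature is compared against.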

It will be convenient to use the following class notations. If S is a regular
surface element admitting for some orientation of the axes a representation
ζ = φ(ξ, η) where φ(ξ, η) has derivatives of order n which satisfy a uniform
Dini condition on S we shall say that S is a regular surface element of class
C^{n+δ}. If f(ξ, η) has derivatives of order n on S which satisfy a uniform Dini
condition there, we shall say that f(ξ, η) is of class C^{n+δ} on S. To say that S
(or f(ξ, η)) is of class C^{n+λ} will mean that the derivatives of φ(ξ, η) (or f(ξ, η))
of order n satisfy a uniform Hölder condition with exponent λ on S.
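The two classes are related in a simple way, which the following short computation illustrates. It is a remark added for orientation; the symbol A for the Hölder coefficient is an assumption of this sketch. If |f(q) − f(p)| ≤ A r^λ with 0 < λ ≤ 1, the Dini integral is dominated by a convergent integral whose value does not depend on p or on the normal plane, so a uniform Hölder condition implies a uniform Dini condition.

```latex
\int_0^{a} \frac{|f(q) - f(p)|}{r}\,dr
  \;\le\; A \int_0^{a} r^{\lambda - 1}\,dr
  \;=\; \frac{A\,a^{\lambda}}{\lambda},
\qquad\text{and likewise}\qquad
\int_0^{\varepsilon} \frac{|f(q) - f(p)|}{r}\,dr \;\le\; \frac{A\,\varepsilon^{\lambda}}{\lambda} \;\to\; 0
\quad (\varepsilon \to 0).
```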


A region of type V will be understood to mean a closed region of space (con- 
sisting of an open continuum and its boundary), partially bounded by the sur- 
face S under consideration, but containing no boundary points of S, and such 
that the point P can approach S from only one side while remaining in V. 

We shall frequently have to consider the product or quotient of two func- 
tions each of which satisfies a uniform Dini condition on S. We shall need the 
following lemmas. 


Lemma 1. Let f and φ be continuous functions satisfying a uniform Dini con-
dition on S. Then fφ satisfies a uniform Dini condition on S, and, provided φ has
a positive lower bound, so also does f/φ.


This follows at once from the inequalities 


* In a paper entitled Potential functions on the boundary of their regions of definition, these Trans- 
actions, vol. 9 (1908), pp. 39-50, Professor Kellogg used a condition of the Dini type in a study of 
double distributions in the plane. The present results are more general in that less is required of the 
boundary curve. 

† For definition, see Kellogg, Foundations of Potential Theory, Berlin, 1929, p. 105.

‡ It is readily shown on the basis of the regularity of S that, for sufficiently small a, to each value
of r, 0 ≤ r ≤ a, there corresponds one and only one point q.


THE DERIVATIVES OF POTENTIALS 


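$$\int_0^{\varepsilon} \frac{|f(q)\phi(q) - f(p)\phi(p)|}{r}\,dr \le \max|f| \int_0^{\varepsilon} \frac{|\phi(q) - \phi(p)|}{r}\,dr + \max|\phi| \int_0^{\varepsilon} \frac{|f(q) - f(p)|}{r}\,dr,$$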


together with similar ones for the quotient. 
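For the quotient, one possible form of the "similar" inequality is sketched below. The symbol m, denoting the positive lower bound of φ assumed in Lemma 1, is introduced here only for illustration. The identity in the first line is elementary; dividing the second line by r and integrating from 0 to ε gives the uniform Dini condition for f/φ.

```latex
\frac{f(q)}{\phi(q)} - \frac{f(p)}{\phi(p)}
  = \frac{\bigl(f(q) - f(p)\bigr)\,\phi(p) - f(p)\,\bigl(\phi(q) - \phi(p)\bigr)}{\phi(q)\,\phi(p)},
\\[4pt]
\left| \frac{f(q)}{\phi(q)} - \frac{f(p)}{\phi(p)} \right|
  \le \frac{\max|\phi|\;|f(q) - f(p)| + \max|f|\;|\phi(q) - \phi(p)|}{m^{2}},
\qquad \phi \ge m > 0 .
```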
In the same way, we establish 


Lemma 2. If S is of class C^{1+δ}, sec γ = (1 + φ_ξ² + φ_η²)^{1/2} satisfies a uniform
Dini condition on S.


Here γ denotes the angle between the normal to S at q and the ζ-axis. As a
corollary to these lemmas, we note that if S is in C^{1+δ} and if f(q) satisfies a
uniform Dini condition on S, the same is true of f(q) sec γ.

We shall also have need of 

Lemma 3. Let S be a regular surface element, and let r' denote the projection
of r = pq on the tangent plane to S at p. Then if either of the integrals

$$\int_0^{a'} \frac{|f(q) - f(p)|}{r'}\,dr', \qquad \int_0^{a} \frac{|f(q) - f(p)|}{r}\,dr$$

converges uniformly as to any parameters, so does the other.


This follows from the facts, first that r' ≤ r, and secondly that r ≤ arc
pq ≤ r' max |sec γ|, the arc being in the plane through p and q containing the
normal to S at p. The existence of the maximum of sec γ is implied in the
definition of regular surface element. Thus the above integrals may be used
interchangeably in Dini conditions.

It is important to notice the following. 


Lemma 4. The definition of the classes C^{n+δ} and C^{n+λ} is independent of the
coördinate system. That is, if S is in either class for one system of orthogonal
axes, it is in the same class for any other system of orthogonal axes in which the
angle between the tangent plane to the surface and the ζ-axis has a positive lower
bound.

If x, y, z are the coördinates referred to the given axes and ξ, η, ζ the co-
ordinates referred to any other axes with the specified properties, we have
on S

$$\xi = l_1 x + m_1 y + n_1 f(x, y), \qquad \eta = l_2 x + m_2 y + n_2 f(x, y), \qquad \zeta = l_3 x + m_3 y + n_3 f(x, y),$$




where z = f(x, y) is the given representation of S. These equations define a
representation ζ = φ(ξ, η) of S with respect to the second system of axes. We
find, then,

$$\phi_\xi = -\,\frac{l_1 f_x + m_1 f_y - n_1}{l_3 f_x + m_3 f_y - n_3}.$$

The denominator is the cosine of the angle between the normal to S and the
ζ-axis multiplied by (1 + f_x² + f_y²)^{1/2}. It has, therefore, a positive lower bound.
Then, if S is of class C^{1+δ} in the given system of coördinates, it follows that
this derivative satisfies a uniform Dini condition on S. The same considera-
tions apply to the other derivative of the first order of φ. Therefore, S is of
class C^{1+δ} in the second system of coördinates. Similarly, if S is of class C^{1+λ} in
the given system of coördinates, the derivatives of the first order of φ satisfy
a uniform Hölder condition on S.

In the same way, Lemma 4 can be extended to derivatives of any order.
Thus the concept of the class of a regular surface element S is independent of
the coördinate system provided no normal to the surface is at right angles to
the ζ-axis.

If S is a regular surface element there exists a positive constant c such that
if σ is the portion of S in a sphere of radius c about any interior point of S,
then σ has a standard representation with tangent-normal axes at any point
p of σ.* Applying Lemma 4, we see that if S is of class C^{n+δ}, σ is of class
C^{n+δ} with respect to a tangent-normal system of axes at any point p of σ.

In our theorems on the derivatives of the first order of the potential of a 
simple distribution we need the following. 

Lemma 5. If S is a regular surface element of class C^{1+δ} having a standard
representation with respect to a tangent-normal system of axes at any interior
point p given by ζ = φ(ξ, η), the integral

$$\iint_{\sigma'} \frac{|\zeta|}{r'^3}\,dS',$$

where σ' is a circle about p in the (ξ, η)-plane and r' is the projection of r on
this plane, converges at p, and the convergence is uniform as to p.


We cut out from σ' a circle c of radius ε. Then

$$\iint_{\sigma' - c} \frac{|\zeta|}{r'^3}\,dS' = \int_0^{2\pi}\!\!\int_{\varepsilon}^{a} \frac{|\zeta|}{r'^2}\,dr'\,d\theta,$$

* See Kellogg, Foundations of Potential Theory, loc. cit., p. 157. 




since the integrand is continuous in σ' − c. This equality holds for any ε > 0.
If the inner integral on the right converges uniformly as to θ when ε → 0, the
iterated integral approaches a limit, which limit is the value of the improper
double integral over σ', as well as the value of the iterated integral in which
r' goes from 0 to a.

We shall study, then,

$$\int_{\varepsilon}^{a} \frac{|\zeta|}{r'^2}\,dr'.$$

If we write ξ = t cos θ, η = t sin θ,

$$\zeta = \phi(0, 0) + \int_0^{r'} \frac{d\phi}{dt}\,dt = 0 + \int_0^{r'} \bigl[\phi_\xi(t\cos\theta, t\sin\theta)\cos\theta + \phi_\eta(t\cos\theta, t\sin\theta)\sin\theta\bigr]\,dt.$$
0 


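Hence,

$$|\zeta| \le \int_0^{r'} t\,\frac{d}{dt}(I + J)\,dt,$$

where

$$I = \int_0^{t} \frac{|\phi_\xi(r'\cos\theta, r'\sin\theta)|}{r'}\,dr', \qquad J = \int_0^{t} \frac{|\phi_\eta(r'\cos\theta, r'\sin\theta)|}{r'}\,dr'.$$

By hypothesis, I and J vanish with t uniformly as to θ. Therefore,

$$\int_{\varepsilon}^{a} \frac{|\zeta|}{r'^2}\,dr' \le \int_{\varepsilon}^{a} \frac{1}{r'^2} \int_0^{r'} t\,\frac{d}{dt}(I + J)\,dt\,dr'.$$

Integrating by parts, we have for the integral on the right

$$\frac{1}{\varepsilon}\int_0^{\varepsilon} t\,\frac{d}{dt}(I + J)\,dt - \frac{1}{a}\int_0^{a} t\,\frac{d}{dt}(I + J)\,dt + (I + J)\Big|_{t=a} - (I + J)\Big|_{t=\varepsilon}.$$

The last term vanishes with ε. This is true of the first term, also; since, by the
law of the mean,

$$\frac{1}{\varepsilon}\int_0^{\varepsilon} t\,\frac{d}{dt}(I + J)\,dt = \frac{\bar t}{\varepsilon}\int_0^{\varepsilon} \frac{d}{dt}(I + J)\,dt = \frac{\bar t}{\varepsilon}\,(I + J)\Big|_{t=\varepsilon},$$

where 0 < t̄/ε < 1. Therefore,

$$\int_0^{a} \frac{|\zeta|}{r'^2}\,dr' \le -\frac{1}{a}\int_0^{a} t\,\frac{d}{dt}(I + J)\,dt + (I + J)\Big|_{t=a} \le (I + J)\Big|_{t=a},$$

since the derivatives of I and J with respect to t are never negative. It follows
therefore, not only that

$$\iint_{\sigma'} \frac{|\zeta|}{r'^3}\,dS'$$

converges everywhere on S, but that this integral approaches 0 with the
radius of σ', uniformly.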
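As an illustration of Lemma 5 in the more familiar Hölder setting, the estimate can be made explicit. The following is a sketch under an added assumption, namely that with tangent-normal axes at p the first derivatives obey |φ_ξ| ≤ A t^λ and |φ_η| ≤ A t^λ at the point (t cos θ, t sin θ); the constant A is introduced here for illustration only. The bound obtained tends to 0 with the radius a of σ', uniformly in p, as the lemma asserts in the Dini case.

```latex
|\zeta| \;\le\; \int_0^{r'} \bigl(|\phi_\xi| + |\phi_\eta|\bigr)\,dt
        \;\le\; \frac{2A}{1+\lambda}\, r'^{\,1+\lambda},
\qquad
\iint_{\sigma'} \frac{|\zeta|}{r'^{3}}\,dS'
  = \int_0^{2\pi}\!\!\int_0^{a} \frac{|\zeta|}{r'^{2}}\,dr'\,d\theta
  \;\le\; \frac{4\pi A}{\lambda(1+\lambda)}\, a^{\lambda}.
```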

3. Existence and continuity of the derivatives of the first order of the po- 
tential of a simple distribution. 

Tangential derivatives. We prove the following theorem: 


THEOREM I. Let S be a regular surface element of class C^{1+δ}. Let σ, the den-
sity of a simple distribution on S, satisfy a uniform Dini condition at the interior
point p of S. Then the derivative of the potential U at P in the direction of any
tangent at p approaches a limit as P approaches p along the normal. If σ satis-
fies a uniform Dini condition over a closed portion of S containing no boundary
points of S, the limits of such derivatives are approached uniformly as to p on
such a portion.


We shall restrict ourselves to the portion of S contained in a sphere of
radius c about p, such that we have a single representation of the whole
piece with a tangent-normal system of axes at any point of the piece. We have
seen that such a positive constant c exists uniformly as to p. As the potential
of the rest of S is analytic in a neighborhood of p, we may neglect it and as-
sume that all of S is contained in this sphere.

We take a tangent-normal system of axes at p, choosing the x-axis in the
direction in which we are taking the derivative and the y-axis in the perpen-
dicular tangential direction. Let P be a point on the z-axis. Then for z ≠ 0,

$$\frac{\partial U}{\partial x} = \iint_{S'} \sigma(q)\,\frac{\xi}{r^3}\,\sec\gamma\,dS',$$

S' being the projection of S on the (x, y)-plane and γ the angle between the
normal to S and the z-axis. This derivative may be written

$$\frac{\partial U}{\partial x} = I + J,$$

where

$$I = \iint_{\sigma'} \sigma(q)\,\frac{\xi}{r^3}\,\sec\gamma\,dS', \qquad J = \iint_{S' - \sigma'} \sigma(q)\,\frac{\xi}{r^3}\,\sec\gamma\,dS',$$

σ' being a small circle of radius a about p in the (x, y)-plane. Then, for any
fixed σ', J is continuous, and corresponding to any ε > 0, there is a δ depend-
ing on ε and σ' only, such that


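$$|J(z_2) - J(z_1)| < \varepsilon/3 \quad\text{when}\quad 0 < z_1 < \delta,\ 0 < z_2 < \delta.$$

We write

$$I = I_1 + I_2, \qquad I_1 = \sigma(p)\iint_{\sigma'} \frac{\xi}{r^3}\,dS', \qquad I_2 = \iint_{\sigma'} \bigl[\sigma(q)\sec\gamma - \sigma(p)\bigr]\,\frac{\xi}{r^3}\,dS',$$

and compare I_1 with the integral

$$\sigma(p)\iint_{\sigma'} \frac{\xi}{\rho^3}\,dS', \quad\text{where}\quad \rho^2 = \xi^2 + \eta^2 + z^2.$$

This integral vanishes since the integrand has equal and opposite values at
(ξ, η) and (−ξ, −η). Hence,

$$I_1 = \sigma(p)\iint_{\sigma'} \xi\left(\frac{1}{r^3} - \frac{1}{\rho^3}\right)dS' = \sigma(p)\iint_{\sigma'} \xi\,\frac{\zeta\,(2z - \zeta)(\rho^2 + \rho r + r^2)}{r^3\rho^3(r + \rho)}\,dS'.$$

Since

$$|\xi| \le r', \qquad r' \le r, \qquad r' \le \rho, \qquad |2z - \zeta| \le r + \rho,$$

it follows that

$$|I_1| \le 3\max|\sigma| \iint_{\sigma'} \frac{|\zeta|}{r'^3}\,dS'.$$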


By Lemma 5, the last integral converges. It follows that we can so choose σ'
that

$$\iint_{\sigma'} \frac{|\zeta|}{r'^3}\,dS' < \frac{\varepsilon}{18\max|\sigma|},$$

unless max |σ| = 0 (in which case I_1 = 0). Therefore, by choosing a small
enough we can make |I_1| < ε/6, independently of z. Then

$$|I_1(z_2) - I_1(z_1)| < \varepsilon/3.$$


As to the second integral, we have

$$|I_2| \le \iint_{\sigma'} \bigl|\sigma(q)\sec\gamma - \sigma(p)\bigr|\,\frac{|\xi|}{r^3}\,dS' \le \int_0^{2\pi}\!\!\int_0^{a} \frac{|\sigma(q)\sec\gamma - \sigma(p)|}{r'}\,dr'\,d\theta.$$

By Lemmas 1 and 2, the inner integral converges at p, uniformly as to θ.
Therefore, by taking a small enough, a depending on p in this case, we can
make

$$\int_0^{a} \frac{|\sigma(q)\sec\gamma - \sigma(p)|}{r'}\,dr' < \varepsilon/(12\pi),$$

independently of θ. Then we shall have

$$|I_2(z_2) - I_2(z_1)| < \varepsilon/3.$$

Having chosen a we can, as we have seen, by taking 0 < z_1 < δ, 0 < z_2 < δ, make

$$|J(z_2) - J(z_1)| < \varepsilon/3.$$

Therefore, for z_1 and z_2 so restricted,

$$\left|\frac{\partial U(z_2)}{\partial x} - \frac{\partial U(z_1)}{\partial x}\right| < \varepsilon,$$

and the derivative approaches a limit as P approaches p along the normal
at p.

The only inequality which depends on the point p is that for I_2. If σ satis-
fies a uniform Dini condition on a closed portion of S this inequality can be
made independent of the position of p on this portion of S. Therefore, under
these conditions, the approach of the derivative to its limit along the normal
will be uniform on this part of S, both as to the point p, and as to the tan-
gential direction of differentiation.




Normal derivatives. In studying the normal derivative at P as P ap-
proaches p along the normal, we need assume only that σ is bounded and
integrable on S and continuous at p. We assume that S, as in the case of tan-
gential derivatives, is a regular surface element of class C^{1+δ}.

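When z ≠ 0 the normal derivative at P(0, 0, z) is given by

$$\frac{\partial U}{\partial z} = \iint_{S'} \sigma(q)\,\frac{\zeta - z}{r^3}\,\sec\gamma\,dS', \tag{2}$$

where S' is the projection of S on the (x, y)-plane. Let U' be the potential of a
plane lamina occupying the area S' and having the density σ sec γ. Then

$$\frac{\partial U'}{\partial z} = -\iint_{S'} \sigma(q)\sec\gamma\,\frac{z}{\rho^3}\,dS', \quad\text{where}\quad \rho^2 = \xi^2 + \eta^2 + z^2. \tag{3}$$

This derivative, as is well known, approaches ∓2πσ(p) according as P ap-
proaches p along the positive or the negative z-axis.
We consider the difference

$$\frac{\partial U}{\partial z} - \frac{\partial U'}{\partial z} = I_1 + I_2, \tag{4}$$

where

$$I_1 = \iint_{\sigma'} \sigma(q)\sec\gamma\left(\frac{\zeta - z}{r^3} + \frac{z}{\rho^3}\right)dS', \qquad I_2 = \iint_{S' - \sigma'} \sigma(q)\sec\gamma\left(\frac{\zeta - z}{r^3} + \frac{z}{\rho^3}\right)dS'.$$

The integral I_2 is continuous in z for fixed σ'. As to I_1, if we use the algebraic
identity already employed in connection with the tangential derivatives, and
the inequalities

$$|z| \le \rho, \qquad |2z - \zeta| \le r + \rho, \qquad r' \le r, \qquad r' \le \rho,$$

we find that

$$|I_1| \le 4\max|\sigma\sec\gamma| \iint_{\sigma'} \frac{|\zeta|}{r'^3}\,dS'.$$

Thus, by Lemma 5, I_1 can be made arbitrarily small by sufficiently restricting σ'.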


We conclude that the difference (4) is continuous at z=0. The value of 
this difference at p is given by the convergent integrals 


Using the values of the limits of the derivative (3), we have as limits of the 
derivative (2) when P approaches p from the positive or negative side


These limits are approached uniformly as to p on any closed interior portion of 
S on which σ is continuous.

Letting n denote the direction of the normal in the positive sense to S at p,
we may state the results obtained, in


THEOREM II. Let S be a regular surface element of class C^{1+δ}. Let σ, the den-
sity of a simple distribution on S, be continuous at p. Then the normal derivative
of the potential of this distribution approaches a limit as P approaches p along
the normal to S at p from either side and these limits are

$$\frac{\partial U}{\partial n_+} = -2\pi\sigma(p) + \iint_S \sigma(q)\,\frac{\partial}{\partial n}\frac{1}{r}\,dS,$$

$$\frac{\partial U}{\partial n_-} = +2\pi\sigma(p) + \iint_S \sigma(q)\,\frac{\partial}{\partial n}\frac{1}{r}\,dS.$$

These limits are approached uniformly as to p on any closed portion of S, con-
taining no boundary points of S, on which σ is continuous.
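The jump described in Theorem II can be checked on the simplest case, a flat disc with constant density, where the surface integral in the limit formulas vanishes at the center because the normal derivative of 1/r is zero in the plane of the disc. The sketch below is illustrative only: the function names, the radius a = 1 and the density σ = 1 are assumptions, and the closed form follows from U(0, 0, z) = 2πσ(√(a² + z²) − |z|).

```python
import math


def disc_potential_dz(z, a=1.0, sigma=1.0):
    """Closed form of dU/dz on the axis of a flat disc of radius a.

    U(0, 0, z) = 2*pi*sigma*(sqrt(a*a + z*z) - |z|) for a constant density
    sigma, so dU/dz = 2*pi*sigma*(z / sqrt(a*a + z*z) - sign(z)) for z != 0.
    """
    return 2.0 * math.pi * sigma * (z / math.hypot(a, z) - math.copysign(1.0, z))


def disc_potential_dz_quadrature(z, a=1.0, sigma=1.0, steps=20_000):
    """Midpoint rule for -sigma * z * integral of 2*pi*s / (s*s + z*z)**1.5 ds over [0, a]."""
    h = a / steps
    total = 0.0
    for k in range(steps):
        s = (k + 0.5) * h
        total += 2.0 * math.pi * s / (s * s + z * z) ** 1.5
    return -sigma * z * h * total


if __name__ == "__main__":
    # The one-sided values tend to -2*pi*sigma from above and +2*pi*sigma from below.
    for z in (0.3, 1e-3, 1e-9, -1e-9):
        print(z, disc_potential_dz(z))
```

The two one-sided limits differ by 4πσ, the classical discontinuity of the normal derivative of a simple distribution.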


Derivatives in any direction. It follows from Theorems I and II, that if S
is a regular surface element of class C^{1+δ} and σ satisfies a uniform Dini con-
dition at p, the derivative in any direction approaches a limit as P approaches
p along the normal at p. If σ satisfies a uniform Dini condition on a closed in-
terior portion of S, the derivative of U in a fixed direction approaches its
limits uniformly along the normals on this portion of S. We shall now prove


THEOREM III. Let S be a regular surface element of class C^{1+δ}. Let σ satisfy
a uniform Dini condition on S. Then the potential U of the distribution of den-
sity σ on S has derivatives of the first order which, when defined on S by their
limits, are continuous in any region of type V.


The difficulty of the situation arises from the fact that, in the absence of 
hypotheses assuring curvatures of S, we cannot count on the existence of a 
field of normals to S. The following lemma, however, will be all that is needed: 




Let Q be a point on the normal to S at p, and let σ denote a sphere about Q. Then
there exists a sphere σ' about p, such that every point P of σ' lies on a normal to
S which meets σ.

Let a denote the radius of a sphere σ' about p. We shall show that a can
be chosen small enough to meet the requirements of the lemma. As a first
restriction, 2a is to be less than the distance from the interior point p to the
boundary of S. Let P be a point of σ'. The largest sphere about P which con-
tains in its interior no points of S, has on its surface at least one interior point
of S. Let q be such a point. Then Pq, being a minimal segment from P to S, is
normal to S at q. Moreover, Pq ≤ Pp < a, and hence pq < 2a. Furthermore, be-
cause of the continuity of the direction cosines of the normal to S, the nor-
mals to S at points in σ' make with the normal at p, angles having a maximum
η, which approaches 0 with a. It follows that the greatest distance from Q to
the normal Pq does not exceed 2a + pQ sin η. As η → 0 with a, a can be chosen
positive and so small that the normal Pq will certainly meet σ.

Turning now to the proof of the theorem, we consider the continuity of
any derivative of U at an interior point p of S. Let F(P) denote the derivative
in question. F(P) approaches its limits along normals, uniformly. There thus
corresponds to any ε > 0, a one-sided neighborhood N of p, in which F(P)
differs from its limiting values on normals by less than ε/4. Let Q be a point
of the normal at p in N. Then since F(P) is continuous at Q, there is a sphere
σ about Q and in N, such that if P' is in σ,

$$|F(P') - F(Q)| < \varepsilon/4.$$

If P is any point in the sphere σ', which corresponds, by the lemma, to σ, we
may take P' as a point of the normal Pq. We then have the following in-
equalities, resulting from the uniform approach along the normals:

$$|F(P') - F(q)| < \varepsilon/4,$$

$$|F(P) - F(q)| < \varepsilon/4,$$

$$|F(Q) - F(p)| < \varepsilon/4.$$

It follows from these that if P is any point in σ',

$$|F(P) - F(p)| < \varepsilon,$$
and the derivative is thus continuous at p. The rest of the theorem follows 
from the analytic character of the potential at points not on S. 
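For the reader who wants the combination step of the continuity argument spelled out, the four estimates of size ε/4 are chained through the intermediate points q, P' and Q by the triangle inequality. This is a routine step written out here for convenience, using only the points named in the proof.

```latex
|F(P) - F(p)|
  \;\le\; |F(P) - F(q)| + |F(q) - F(P')| + |F(P') - F(Q)| + |F(Q) - F(p)|
  \;<\; 4 \cdot \frac{\varepsilon}{4} \;=\; \varepsilon .
```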

4. Existence and continuity of the potential of a double distribution at
points of the distribution. We shall consider the potential of a double distri-
bution on a regular surface element S of class C^{1+δ}. The moment μ is to be
bounded and integrable. The potential is given by the integral

$$U = \iint_S \mu\,\frac{\partial}{\partial\nu}\frac{1}{r}\,dS.$$


For the sake of completeness, we show first that this improper integral con- 
verges on S under the present assumptions. Using as the field of integration 
the projection of S on the tangent plane at p, this amounts to verifying that 


tim ff lim f f 
sie’ OV tae! 


exists, σ' being a circle about p, whose radius tends to 0. But since μ is
bounded and integrable,
lim f f a dS’ 


exists, by Lemma 5. Moreover, 


where ε is the radius of σ', and a the maximum radius of S'. As the integral on
the right converges, because S is in C^{1+δ}, the rest of the integral under dis-
cussion converges absolutely. Thus the integral for U has a meaning at points
of S. We denote its value by U_0.

If we introduce cos α, cos β, cos γ, the direction cosines of the normal to S
at q(ξ, η, ζ), U takes the form

$$U = -\frac{\partial}{\partial x}\iint_S \frac{\mu\cos\alpha}{r}\,dS - \frac{\partial}{\partial y}\iint_S \frac{\mu\cos\beta}{r}\,dS - \frac{\partial}{\partial z}\iint_S \frac{\mu\cos\gamma}{r}\,dS.$$


We suppose S is referred to tangent-normal axes at p. The first two terms on
the right then represent tangential derivatives of simple distributions on S,
while the third is the normal derivative of such a distribution. If μ is contin-
uous at p, μ cos α and μ cos β satisfy a uniform Dini condition there, as is
readily verified, and μ cos γ is continuous. From these facts and from The-
orems I and II we obtain


THEOREM IV. Let S be a regular surface element of class C^{1+δ}. Let μ, the
moment of a double distribution on S, be bounded and integrable on S and con-
tinuous at the interior point p of S. Then, as P approaches p along the normal to
S at p from either side, the potential U of the double distribution approaches
limits given by

$$U_+ = 2\pi\mu(p) + U_0,$$

$$U_- = -2\pi\mu(p) + U_0.$$

On any closed portion of S containing no boundary points of S, on which μ is con-
tinuous, these limits are approached uniformly.
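The limits in Theorem IV can be verified by hand on a flat disc. The following worked example is a sketch under added assumptions: S is the disc r' ≤ a in the plane ζ = 0, the moment μ is constant, the normal ν points along the positive z-axis, and P = (0, 0, z). Then the normal derivative of 1/r at a point of the disc is z/r³ with r² = r'² + z², and the integral is elementary.

```latex
U(0,0,z) = \mu \int_0^{2\pi}\!\!\int_0^{a} \frac{z\,r'}{(r'^2 + z^2)^{3/2}}\,dr'\,d\theta
         = 2\pi\mu\left(\frac{z}{|z|} - \frac{z}{\sqrt{a^2 + z^2}}\right),
\qquad
\lim_{z \to 0^{\pm}} U = \pm 2\pi\mu .
```

On the disc itself the integrand vanishes, so U_0 = 0 at the center, in agreement with U_+ = 2πμ(p) + U_0 and U_− = −2πμ(p) + U_0.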


By the reasoning used in proving Theorem III we have 

THEOREM V. Let S be a regular surface element of class C^{1+δ}. Let μ, the
moment of a double distribution on S, be continuous on S. Then the potential U
of the distribution, when defined on S by its limits, is continuous in any region of
type V.

5. Korn's identities. In studying the derivatives of order n of the poten-
tials of simple and double distributions we shall use two identities due to
Korn.* We shall state these identities in the following lemmas.


Lemma 6. If S is a regular surface element of class C'' and σ is continuously
differentiable on S, the identity


= f cos (s, ¢) — cos cos (5,7) 
Ox r 


r 


1 
+ + my + m) + od + + }as 


+ f —as 


(where Γ is the boundary of S, and l, m, n are the direction cosines of the nor-
mal at q) holds for points not on S.


Lemma 7. If S is a regular surface element and μ is continuously differen-
tiable on S, the identity
: f fos og (s, cos (r, — cos (s, n) cos (r, 


Ov r r r? 


a 1 a 1 a 1 
+— J. + — f luy—dS + — f luz—dS 
Ox OyJJs OzJJs 


+ [fuss 


(where Γ is the boundary of S, and l, m, n are the direction cosines of the nor-
mal at q) holds for points not on S.


* A. Korn, Potentialtheorie, vol. I, Berlin, 1899, pp. 36-38, pp. 40-42. 




6. Existence and continuity of the derivatives of order n of the potentials
of simple and double distributions. We have already obtained, in Theorem
III, sufficient conditions that the derivatives of the first order of a simple
distribution on S be continuous when defined on S by their limiting values.
Using Theorems III and V and Lemma 7, we find that if S is a regular surface
element of class C^{1+δ} and μ is of class C^{1+δ} on S, the potential of the double
distribution of moment μ on S is continuously differentiable in a region of
type V.

We shall now prove the general theorems for derivatives of order n. 


THEOREM VI. Let S be a regular surface element of class C^{n+δ}. Let σ, the
density of a simple distribution on S, be of class C^{n−1+δ}. Then U, the potential of
this distribution, has continuous derivatives of order n in any region of type V
when they are defined on S by their limits.

THEOREM VII. Let S be a regular surface element of class C^{n+δ}. Let μ, the
moment of a double distribution on S, be of class C^{n+δ}. Then U, the potential of
this distribution, has continuous derivatives of order n in any region of type V
when they are defined on S by their limits.

We prove these theorems, already established for n = 1, by induction. We
assume that both theorems are true when n is replaced by n−1.

Let S have the standard representation ζ = φ(ξ, η) with respect to the
(ξ, η, ζ)-axes. Then φ is of class C^{n+δ}. For the potential of Theorem VI, by
Lemma 6, we have

J cos (v, n) cos (s, £) — cos (v, £) cos (s, 7) 
= f ds 


r 


1 
+ J — + + of + oym]}— as 


+ ffi 


with similar expressions for ∂U/∂y and ∂U/∂z.

The first term in (5) is analytic at all points not on Γ. Therefore, this term
has continuous derivatives of all orders in the closed region V. The second
term is the potential of a simple distribution on S with density of class
C^{n−2+δ}. By our assumption that the theorem is true when n is replaced by
n−1, this term has continuous derivatives of order n−1 in V when they are
defined on S by their limits. The same is true of the third term, since it is the
potential of a double distribution on S with moment of class C^{n−1+δ}. There-
fore, the whole expression (5) has continuous derivatives of order n−1 in V
when they are defined on S by their limits. The expressions for the other two
derivatives of U can be discussed in a similar manner, with the result that
they also have continuous derivatives of order n−1 in V. Therefore, U has
continuous derivatives of order n in V when they are defined on S by their
limiting values.

Turning to the proof of Theorem VII, we study the potential

$$U = \iint_S \mu\,\frac{\partial}{\partial\nu}\frac{1}{r}\,dS,$$

for which, by Lemma 7, we have the identity (6),


f cos (s, ¢) cos (r, ) — cos (s, ) cos (r, D | 


with similar expressions for ∂U/∂y and ∂U/∂z.

The reasoning employed for Theorem VI now yields the desired result. 

We have proved then, that if the theorems are true for derivatives of order
n−1, they are true for derivatives of order n. We know that they hold for
n = 1. Therefore, they hold for any positive integral n.

7. On the scope of the sufficient conditions established above. We have 
just found sufficient conditions that surface potentials have continuous 
derivatives of order n. It is natural to ask if we can place lighter conditions 
upon the spreads without impairing the conclusions of Theorems VI and VII. 
If we can produce examples satisfying slightly lighter hypotheses and show 
that the derivatives of order n do not exist or are not continuous, it follows
that these lighter hypotheses cannot furnish sufficient conditions for the ex-
istence and continuity of derivatives of order n.

Let us consider the derivatives of order n of a simple distribution. Is it
possible to get a set of conditions for the existence and continuity of these
derivatives in which we require of σ only that it be of class C^{n−1}?

We let S be a circular lamina of radius a < 1. The plane of S will be the
(ξ, η)-plane. We write ξ = r' cos θ, η = r' sin θ. We consider the densities

$$\sigma = \frac{r'^{\,n-1}\cos\theta}{\log r'}, \ \text{if } n \text{ is odd}, \qquad \sigma = \frac{r'^{\,n-1}}{\log r'}, \ \text{if } n \text{ is even},$$
where r' is measured from the center of S. We see that σ has continuous
derivatives of order n−1 on S and that these derivatives do not satisfy a
Dini condition at r' = 0. We consider

$$\left.\frac{\partial^n U}{\partial x^n}\right|_P,$$

where P is the point (0, 0, z), z ≠ 0, that is, a point on the normal to S at the
center p. We shall prove that this derivative becomes infinite as P approaches
p.
In fact, 


-f fx 
Ox" |p Ox" 


1 CrP 
Os* r r 


where C_n > 0, and where P_n(u) is the Legendre polynomial of order n. Then
P, 
=(- f fo 
Ox" |p 


- 


If we break S into the circle of radius αz and the remaining annular region,
where α is a suitably chosen constant and αz < a, it can be shown that, when
the density σ is the function given above, the integral over the first of these
regions approaches 0 with z while the integral over the second region becomes
infinite. Hence, the derivative in question becomes infinite as P approaches p.

Therefore, σ being of class C^{n−1} does not insure the existence of the deriva-
tives of order n no matter how smooth the surface S.
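The slow divergence can be observed numerically in the case n = 1. The sketch below assumes the density σ = cos θ / log r' on a disc of radius a = 1/2 (the n = 1 case as read here from the damaged formula); the function name and the truncation of the lower limit are illustrative choices, not part of the paper. After the integration over θ, the derivative ∂U/∂x at (0, 0, z) reduces to a single integral in r', which grows in absolute value roughly like π log|log z| as z tends to 0.

```python
import math


def tangential_derivative_on_axis(z, a=0.5, steps=40_000):
    """Approximate dU/dx at (0, 0, z) for the density cos(theta) / log(r').

    After integrating over theta the derivative reduces to
        pi * integral over (0, a) of r'**2 / (log(r') * (r'**2 + z**2)**1.5) dr'.
    With u = log r' this becomes pi * integral of g(u) / u du, where
    g(u) = (1 + (z / r')**2) ** -1.5 decays like (r'/z)**3 once r' << z,
    so the lower limit is truncated 20 units of u below log z.
    """
    u_lo, u_hi = math.log(z) - 20.0, math.log(a)
    h = (u_hi - u_lo) / steps
    total = 0.0
    for k in range(steps):
        u = u_lo + (k + 0.5) * h
        r = math.exp(u)
        total += (1.0 + (z / r) ** 2) ** -1.5 / u
    return math.pi * h * total


if __name__ == "__main__":
    # The values are negative and keep decreasing as z shrinks.
    for z in (1e-2, 1e-4, 1e-8, 1e-16):
        print(z, tangential_derivative_on_axis(z))
```

Since the growth is only doubly logarithmic, very small values of z are needed before the unboundedness becomes visible, which is why the quadrature is carried out in the variable u = log r'.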

By similar reasoning it can be shown that S being of class C^n does not
insure the existence of the derivatives of order n of a simple distribution on S
no matter how smooth the density. In proving this we consider the distribu-
tion of unit density on the surface given by

$$\zeta = \frac{r'^{\,n}\cos\theta}{\log r'}, \ \text{if } n \text{ is odd}, \qquad \zeta = \frac{r'^{\,n}}{\log r'}, \ \text{if } n \text{ is even},$$


where 


Turning now to double distributions, it can be shown, by considering the
distribution on a circular lamina with moment μ defined by




=— » if m is odd, wp = — » if m is even, 
log r’ log r’ 
that μ being of class C^n does not insure the existence of the derivatives of order
n of the potential of a double distribution on S no matter how smooth S.
Using a distribution of moment ξ on the surface given by


r’™ cos 
= » if m is odd, § = ———— 
log r’ log r 


—» if m is even, 


it can be shown that S being of class C^n does not insure the existence of the
derivatives of order n of the potential of a double distribution on S, no matter
how smooth the moment.

It thus appears unlikely that materially lighter conditions exist which are 
at once simple and sufficient for the existence and continuity of the deriva- 
tives. 

8. Hölder conditions on the derivatives of surface distributions. It has
been proved* that if S is of class C^{1+λ} and σ is of class C^{λ'}, the derivatives of
the first order of the simple distribution of density σ on S satisfy a uniform
Hölder condition, in a closed region of type V, with exponent κ, where 0 < κ
≤ λ, κ ≤ λ'. If S is of class C^{1+λ} and μ, the moment of a double distribution on
S, is of class C^{1+λ'}, the derivatives of the first order of the potential of this
distribution satisfy a uniform Hölder condition in a region of type V with ex-
ponent κ. We shall now establish general theorems for Hölder conditions on
the derivatives of order n.


THEOREM VIII. Let S be a regular surface element of class C^{n+λ}. Let σ, the
density of a simple distribution on S, be of class C^{n−1+λ'}. Then the derivatives of
order n of the potential of this distribution when defined on S by their limiting
values satisfy a uniform Hölder condition in any region of type V with exponent
κ, where 0 < κ ≤ λ, κ ≤ λ'.


THEOREM IX. Let S be a regular surface element of class C^{n+λ}. Let μ, the
moment of a double distribution on S, be of class C^{n+λ'}. Then the derivatives of
order n of the potential of this distribution when defined on S by their limiting
values satisfy a uniform Hölder condition in any region of type V with exponent
κ, where 0 < κ ≤ λ, κ ≤ λ'.


* See Schauder, loc. cit. Although in the case of the derivatives of the first order, the present 
methods do not yield more general results than those of Schauder, in the case of the potential of a 
double distribution itself, they yield a lighter condition on the surface than that employed by him. 
See Theorem X, p. 156. 




As in an earlier instance, we use the method of induction, assuming that
both theorems are true when n is replaced by n−1. Then, turning first to the
proof of Theorem VIII for n, we consider the identity (5) for ∂U/∂x.

The first term is analytic at all points not on Γ, and so in V. The second
term is the potential of a simple distribution on S with density of class
C^{n−2+κ}. Therefore, this potential has continuous derivatives of order n−1
which satisfy a uniform Hölder condition in V with exponent κ. The third
term is the potential of a double distribution on S with moment of class
C^{n−1+κ}. Therefore, the derivatives of order n−1 of this potential satisfy a
uniform Hölder condition in V with exponent κ. Hence, the derivative of U
with respect to x has continuous derivatives of order n−1 which satisfy a
uniform Hölder condition in V with exponent κ. The same is true of the other
two derivatives of U. Therefore, the derivatives of U of order n satisfy a uni-
form Hölder condition in V with exponent κ.

Similar reasoning applied to (6) yields the desired result in the case of
Theorem IX.

Since both theorems have been established for n = 1, it follows that they
hold for any positive integral n.

9. Hölder conditions on the potentials of surface distributions. We have
been concerned with the existence of Hölder conditions on the derivatives of
potentials of simple and double distributions on regular surfaces. We now
take up the question of such conditions on the potentials themselves.

For a simple distribution, it is known that a distribution with bounded in-
tegrable density σ, on a regular surface element S, has a potential which
satisfies a uniform Hölder condition with any given exponent less than 1. In
fact, the present methods yield the result for such a potential


Re® 
| Us — U;| S 2emax| sec y| riz log SR, 
Ti2 


where R is the maximum chord of S. 
We therefore turn at once to the potential of a double distribution, estab- 
lishing first 


Lemma 8. Let S be a regular surface element of class C^{1+δ}. Then the integral

$$\iint_S \left|\frac{\partial}{\partial\nu}\frac{1}{r}\right| dS \tag{7}$$

is bounded in any region of type V.*


* Schauder (loc. cit.) has proved a similar lemma, less general than this, in that S is required to 
be of class 




As we have seen in connection with the proof of Theorem III, there is a
neighborhood N of any interior closed portion of S with the property that
through any point of N there passes a normal to S at an interior point of S.
This neighborhood can be so chosen as to include all points of S in V. The
points of V not in N are distant at least δ from S, δ being a positive constant.
At such points the integral (7) is bounded. In fact

$$\frac{1}{\delta^2}\iint_S dS$$

is such a bound.


We may assume then that the parameter point P in (7) is on the normal to
S at the interior point p. We take a tangent-normal system of axes at p and
denote by σ a portion of S having a standard representation with these axes.
This representation exists uniformly as to p. Then

$$\iint_{S - \sigma} \left|\frac{\partial}{\partial\nu}\frac{1}{r}\right| dS$$

is bounded uniformly as to p.
For the rest of the integral, we have 


Ov r 


Shtint ds, 


1 
ff 

r 


σ' being the projection of σ on the tangent plane at p. Then, since


where Ω is the solid angle subtended at P(0, 0, z) by the flat element of sur-
face σ',


Also, 


where 

1 

dS’, 
p 
< 2r. 
| 




nsaff has. 
a’ 


By Lemma 5, this integral is bounded and the bound can be taken independ- 
ent of p. The integral J; is bounded, as we have noted at the beginning of §4. 


Therefore, 
01 
f f 
|\Ov r 


is bounded uniformly as to p. It follows that (7) is bounded in V.
We now prove 


THEOREM X. Let S be a regular surface element of class C^{1+δ}. Let μ, the
moment of a double distribution on S, be of class C^λ. Then the potential of this
distribution, when defined on S by its limits, satisfies a uniform Hölder condition
with exponent λ in any region of type V.


As in Lemma 8, we shall restrict ourselves to points of $V$ in the neighborhood
$N$, since the potential is analytic in the rest of $V$. We let $P_1$ be a point of
$V$ on the normal to $S$ at the interior point $p_1$. We shall now assume that all of
$S$ is given by the portion contained in a sphere of radius $c$ about $p_1$, and having
a standard representation with tangent-normal axes at $p_1$, since the potential
of the rest of $S$ will be analytic at $P_1$. Such a representation exists uniformly
as to $p_1$. We let $P_2$ be a second point of $V$ so restricted that the parallel
through $P_2$ to $P_1p_1$ meets $S$ in the interior point $p_2$.
We write 


$$U = U_1 + U_2$$


where 


Since $U_1$ is the potential of a double distribution of constant moment on $S$, it
has continuous derivatives of the first order in $V$, by Theorem VII.

We have then only to prove that $U_2$ satisfies a uniform Hölder condition
with exponent $\lambda$. We denote by $\sigma$ the portion of $S$ in a circular cylinder of
radius $2r_{12}$ whose axis is the normal to $S$ at $p_1$, $r_{12}$ being the distance $P_1P_2$, and
write


$$U_2(P_2) - U_2(P_1) = \Delta_1 + \Delta_2,$$


where 




ff wo - = as, 


the subscripts indicating that the coordinates of $P_1$ and $P_2$ are to be substituted
for $x$, $y$, and $z$. Then


imax] aio — { av re as+ a» st. 


Since the integral in the above inequality is bounded in $V$, by Lemma 8, and
since $\mu$ satisfies a uniform Hölder condition on $S$, it follows that


$$|\Delta_1| \le K_1 r_{12}^{\lambda},$$


$K_1$ being an appropriate constant.
The remaining difference, $\Delta_2$, may be written in the form


8s 1 


where the integration with respect to $s$ is from $P_1$ to $P_2$ along the segment
joining them, and where the integrand is continuous in the field of integration.
From this it follows that


$$|\Delta_2| < 10(1 + 2M)A \int_{s_1}^{s_2}\!\iint_{S-\sigma} r_1^{\lambda}\,\frac{1}{r^{3}}\,dS\,ds,$$


where $M$ is the maximum of $|\varphi_\xi|$ and $|\varphi_\eta|$ and $A$ is the coefficient in the
Hölder condition on $\mu$. Here $r$ is measured from a point on $P_1P_2$, and $r_1$ from
$p_1$. If we denote by $r_1'$ and $r'$ the projections on the tangent plane at $p_1$ of $r_1$
and $r$ respectively, we have the inequalities
ry S maxsecyri, r’ S47, ri S 2r’. 


Hence, 




85 4\ 3 
< 10(1 + 2M)A (max sec ff dSds 
Tr 


<= 80(1 + 2M)A (max sec f ff 
81 


2r a 
80(1 + 2M)A (max sec > f f f ri>—*dr'd6ds 
0 2rie 
160(1 + 2M)xr 
— 


Korie. 


A (max sec 


Combining the inequalities for $\Delta_1$ and $\Delta_2$, we see that $U_2$ satisfies a uniform
Hölder condition with exponent $\lambda$ in $V$. It follows that the same is true
of $U$.
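
The way the exponent $\lambda$ survives in the estimate of $\Delta_2$ can be traced to one elementary radial integral. The sketch below is an illustration under stated assumptions, not a restoration of the printed chain of inequalities: it assumes that, in polar coordinates $(r', \theta)$ on the tangent plane at $p_1$, the integrand is dominated by $C\,r'^{\lambda-3}$ outside the circle of radius $2r_{12}$, that the outer radius may be replaced by infinity, and that the segment $P_1P_2$ has length $r_{12}$. The constant $C$ is a placeholder for numerical factors such as $10(1+2M)A$ and powers of $\max \sec \gamma$.

```latex
\[
\int_{s_1}^{s_2}\int_{0}^{2\pi}\int_{2r_{12}}^{\infty}
   C\, r'^{\lambda-3}\; r'\,dr'\,d\theta\,ds
 \;=\; 2\pi C\, r_{12}\,\frac{(2r_{12})^{\lambda-1}}{1-\lambda}
 \;=\; \frac{2^{\lambda}\pi C}{1-\lambda}\; r_{12}^{\lambda},
 \qquad 0<\lambda<1 .
\]
```

The factor $r_{12}$ supplied by the integration along the segment exactly compensates the singular power $r_{12}^{\lambda-1}$ produced at the inner radius, which is why the radius of the excluded cylinder is tied to $r_{12}$.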

10. The existence and continuity of derivatives of volume potentials. The 
derivatives of the first order of the potential due to a volume distribution of 
bounded and integrable density in a bounded volume are continuous through- 
out space as is well known. As a preliminary to the study of the derivatives 
of higher order, we define the Dini condition in a region of space. 

Definition. Let $f(Q)$ be defined in the closed region $V$, and continuous at
the point $P$ of $V$. Let $Q$ vary along a ray through $P$. Then if the integral


$$\int_0^{a} \frac{|f(Q) - f(P)|}{r}\,dr \tag{8}$$


converges at P uniformly as to the direction of the ray chosen, f(Q) is said to 
satisfy a uniform Dini condition at P. If f(Q) is continuous in V and if the in- 
tegral (8) converges at every point of V uniformly both as to P and as to the 
direction of the integration at P, f(Q) is said to satisfy a uniform Dini con- 
dition in V. 

In case P is a boundary point of V, a given ray may contain both points of 
V and points not in V. In this case, the above integral is to be understood as 
the Lebesgue integral over the closed set of points of the interval (0, a), 
of the ray in V. Since the integrand is non-negative, the convergence of the 
integral is equivalent to the summability of the integrand, and the uniform- 
ity of the convergence means that the integral vanishes with a, uniformly 
as to the direction of the ray. The condition at a boundary point is evidently 
fulfilled if it is possible so to extend the definition of $f(Q)$ to a neighborhood of
the boundary point that it is fulfilled by the extended function. 
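
As a quick check of how this definition relates to the Hölder conditions used earlier, the following worked inequality reads (8) as the integral of $|f(Q) - f(P)|/r$ with respect to $r = PQ$ over $(0, a)$, and assumes a Hölder condition $|f(Q) - f(P)| \le A r^{\lambda}$ with $0 < \lambda \le 1$; the symbols $A$ and $\lambda$ are used here in that assumed sense.

```latex
\[
\int_{0}^{a} \frac{|f(Q)-f(P)|}{r}\,dr
 \;\le\; \int_{0}^{a} A\, r^{\lambda-1}\,dr
 \;=\; \frac{A\,a^{\lambda}}{\lambda}
 \;\longrightarrow\; 0 \quad (a \to 0).
\]
```

Since the bound does not depend on $P$ or on the direction of the ray, a uniform Hölder condition in $V$ implies a uniform Dini condition in $V$. The converse fails, so the Dini condition is the weaker hypothesis.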

We shall adopt for functions defined in a region of space the class notations 
given in §2 for a function defined on a surface. 




Our object is the establishment of the following theorem. 


THEOREM XI. Let $V$ be a regular closed region of space whose boundary $S$
contains a regular surface element $\Sigma$ of class $C^{n-1+D}$, and let $\kappa$, the density of a
distribution in $V$, be of class $C^{n-2+D}$, $n \ge 2$. If $n > 2$ let $\kappa$ be of class $C^{n-2+D}$ on $\Sigma$
as well as in $V$.* Let $V'$ be a closed region of space partly bounded by $\Sigma$, but
containing no boundary points of $\Sigma$ or other points of $S$. Then the potential of this
distribution has continuous derivatives of order $n$ in $V'$ when they are defined on
$\Sigma$ by their limits.


We note that the region $V'$, except for boundary points in $\Sigma$, may be
entirely interior to $V$, or entirely exterior to $V$. The two cases require different
treatment and will therefore be taken up separately. We remark that we may
once and for all confine ourselves to the portion of $V$ inside a certain sphere
which cuts off from $\Sigma$ a portion having a standard representation, $\zeta = \varphi(\xi, \eta)$,
where $\varphi(\xi, \eta)$ is in class $C^{n-1+D}$, when referred to axes tangent and normal to
$\Sigma$ at the center of the sphere. For later purposes we note that a single radius
will serve for this sphere for all points on any closed portion of $\Sigma$ containing
no boundary points of $\Sigma$. The justification of confining ourselves to such a
portion of $V$ lies first in the fact that the potential of the distribution on the rest of $V$ is
analytic at the interior points of the sphere, and secondly that exactly the
same methods as those here used may be applied to points of $V'$ farther
from $\Sigma$.

We shall consider then a region $V$ bounded by a spherical surface and a
portion of $\Sigma$. We prove the theorem first for $n = 2$ and then extend it by
induction. We may confine ourselves to one derivative of the second order,
since the others may be treated by exactly the same methods.

Considering first the case where $V'$ is in $V$, we take a point $P_1$ in $V'$, and
denote by $\kappa_1$ the density at $P_1$. We break up the integral into two parts,


$$U = \kappa_1 \iiint_V \frac{1}{r}\,dV + \iiint_V (\kappa - \kappa_1)\frac{1}{r}\,dV = U_1 + U_2.$$


* A function defined in a volume may satisfy a uniform Dini condition at a boundary point of 
the volume, and yet fail to satisfy a Dini condition, as defined for functions of position on a surface, 
at this point of the bounding surface. For example, the values on the surface of the function 


where 


1 
= #20, 
log log — log — 
= 0, ts 0, 
considered in the volume bounded by {=r”, r’=a, a<1, and ¢= —1, where r?= #-+-7n?, fail to satisfy 
a Dini condition at the origin, although the integral taken along a ray converges uniformly there.




The derivative of the first may be transformed with the help of Green's Theorem,


— — Os 
“Os vy r “Os r 


The differentiation of $U_2$ at the point $P_1$ may in the present case be carried
out under the integral sign because of the sufficiently rapid vanishing of
$(\kappa - \kappa_1)$.* We may therefore write


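the subscript indicating that the coordinates of the corresponding point are
to be substituted for $x$, $y$, and $z$.
Let $P_2$ be a second point near $P_1$. We write


$$\left.\frac{\partial^2 U}{\partial x^2}\right|_{2} - \left.\frac{\partial^2 U}{\partial x^2}\right|_{1} = \Delta_1 + \Delta_2,$$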


The integral in $\Delta_1$ is the potential of a simple spread on $S$ whose density
$\cos(\nu, x)$ is in $C^{D}$ on $\Sigma$, and hence, by Theorem III, it has continuous derivatives
in $V'$. Since $\kappa$ is also continuous, there corresponds to any $\varepsilon > 0$ a
$\delta' > 0$ depending on $\varepsilon$ alone, such that for $P_1P_2 < \delta'$,


$$|\Delta_1| < \varepsilon/4.$$


In the discussion of $\Delta_2$, we introduce a sphere $\sigma$, of radius $a$ about $P_1$, and
impose a second restriction, $P_1P_2 < a/2$, on $P_2$. Then, because of the Dini condition
on $\kappa$, uniform in $V$, and the fact that


it is possible to restrict a so that 


* Petrini, loc. cit., p. 135. 


where 

| m@ i | 2 
— —| 
Ox? +r 




independently of the position of $P_2$ in the sphere $P_1P_2 < a/2$. Thus if we write

$$\Delta_2 = \Delta_{21} + \Delta_{22},$$

in the first term of which the field of integration is $\sigma$, we have

$$|\Delta_{21}| < \varepsilon/2.$$


The second term, $\Delta_{22}$, is the difference in values at $P_1$ and $P_2$ of an integral
whose integrand is continuous in the variables of integration and the coordinates
of the parameter point for $P_1P_2 < a/2$. Hence, if $a$ is fixed, there exists a
$\delta'' > 0$ such that if $P_1P_2 < \delta''$,


$$|\Delta_{22}| < \varepsilon/4.$$


Thus, if $a$ is a fixed positive number, such that $|\Delta_{21}| < \varepsilon/2$, which we have
seen is possible, independently of $P_2$, and if $\delta$ is the least of the positive numbers
$\delta'$, $a/2$, $\delta''$, then for $P_1P_2 < \delta$,


$$\left| \left.\frac{\partial^2 U}{\partial x^2}\right|_{2} - \left.\frac{\partial^2 U}{\partial x^2}\right|_{1} \right| \le |\Delta_1| + |\Delta_{21}| + |\Delta_{22}| < \varepsilon.$$

Accordingly, the continuity of this derivative in $V'$ is established in the case
in which $V'$ is in $V$.

Suppose now that $V'$ is exterior to $V$ except for common boundary
points which are interior points of $\Sigma$. We shall show that the derivative in
question approaches limits uniformly along normals to $\Sigma$ on any closed portion
of $\Sigma$ including no boundary points of $\Sigma$. The reasoning of Theorem III
will then establish the continuity of this derivative in $V'$.

Let $p$ be an interior point of $\Sigma$, and let $P$ be a point in $V'$ on the normal to
$\Sigma$ at $p$. If the $(x, y)$-plane is chosen tangent to $\Sigma$ at $p$, with the positive $z$-axis
through $P$, the value of the derivative at $P$ is given by


(10) = Sff (x — + cos (v, 4s 


where $\kappa_0$ is the density at $p$. Here, the second term is continuous in $V'$ as we
have already seen, and so approaches a limit as $P$ approaches $p$ along the
normal, uniformly in $V'$. The first term we write as $J_1 + J_2$ where




1 1 
n= fff I,= fff « 


$\sigma$ being the portion of $V$ in a sphere of radius $a$ about $p$. A first restriction on $a$
is that it be less than the distance from $p$ to the boundary of $\Sigma$. For the first
integral, we have


where $r_0$ is measured from $p$ to the point of integration, and $r$ is measured
from $P$. The ratio $r_0/r$ will be greatest when $r$ is the projection of $r_0$ on a plane
parallel to the $(x, y)$-plane, which may occur for points of $\sigma$ with positive
$z$-coordinates. However, $r_0/r$ will never exceed $k = \max \sec \gamma$, where $\gamma$ is the
angle between two normals to $\Sigma$ at points of $\sigma$. This number $k$ is independent
of $p$. We have then


which is convergent, and vanishes uniformly with $a$, because $\kappa$ is in $C^{D}$.


Accordingly, $a$ can be chosen so that the oscillation of $J_1$ can be made
arbitrarily small independently of the position of $P$ on the normal. Then, with
$a$ fixed, $J_2$ is continuous at $p$. Hence, the derivative approaches a limit along
the normal. As the inequalities involved can be made independent of $p$, the
approach is uniform in $V'$.

Theorem XI is thus established for $n = 2$. In extending the proof, we make
use of Green's Theorem to write the derivative of the potential


in the form 
1 1 
(11) = f fxcos @, + ff fn—ar, 
Ox 


valid for points $P(x, y, z)$ not on $S$ for $\kappa$ in $C'$. We assume that Theorem XI
has been established for derivatives of order $n-1$. The first term in (11) is
the potential of a simple distribution on $S$ with density of class $C^{n-2+D}$ on $\Sigma$.
By Theorem VI, this potential has continuous derivatives of order $n-1$ in




$V'$. By our assumption that the theorem holds when $n-1$ is substituted for
$n$, the second term has continuous derivatives of order $n-1$ in $V'$. Therefore
$\partial U/\partial x$ has continuous derivatives of order $n-1$ in $V'$. The same applies to
$\partial U/\partial y$ and $\partial U/\partial z$, so that all the derivatives of $U$ with respect to $x$, $y$, and
$z$, of order $n$, exist and are continuous in $V'$.

Since the theorem is true for derivatives of the second order it follows that 
it is true for derivatives of any higher order. 

11. Hölder conditions on the derivatives of volume potentials. The potential
of a distribution of bounded and integrable density in a bounded
volume $V$ has continuous derivatives of the first order which satisfy a uniform
Hölder condition with exponent $\lambda$, where $\lambda$ is any number such that
$0 < \lambda < 1$.* Considering now the derivatives of higher order, we shall prove


THEOREM XII. Let $V$ be a regular closed region of space whose boundary $S$
contains a regular surface element $\Sigma$ of class $C^{n-1+\lambda}$, and let $\kappa$, the density of a
distribution in $V$, be of class $C^{n-2+\lambda'}$. Let $V'$ be a closed region of space partly
bounded by $\Sigma$ but containing no boundary points of $\Sigma$ or other points of $S$. Then
the derivatives of order $n$ of the potential of this distribution, when defined on $\Sigma$
by their limiting values, satisfy a uniform Hölder condition in $V'$ with exponent
$\lambda''$, where $0 < \lambda'' \le \lambda$, $\lambda'' \le \lambda'$.

The proof is analogous to that of Theorem XI. Our induction begins with
the case $n = 2$, and to fix ideas, we consider the second derivative with respect
to $x$, first for the case in which $V'$ is in $V$. By equation (9), this derivative,
regarded as a function of $P$, satisfies


The derivative in the first term on the right satisfies a uniform Hölder condition
in $V'$ with exponent $\lambda$, and by the reasoning of Lemma 1, the first term
then satisfies a uniform Hölder condition in $V'$ with exponent $\lambda''$.

In establishing the same property for the second term we shall need the
following lemma: Let $\sigma$ be the portion of $V$ in a sphere of radius $a$ about an
interior point of $V'$, $a$ being so restricted that $\sigma$ contains no points of $S$ other than
points of $\Sigma$. Then any derivative of the second order of the potential


i 


* See Korn, Sur les équations de l’élasticité, Annales de l’Ecole Normale, (3), vol. 24 (1907), p. 28. 
Although the bounding surfaces of the volumes considered there have bounded curvatures, the rea- 
soning can be extended so as to apply to any bounded volume. 




is bounded at the center of the sphere, and this bound depends only on $\Sigma$.

If the sphere is entirely interior to $V$ such a bound is seen at once to exist.
We assume then that there are points of $\Sigma$ in the sphere. Through $P$, the
center of the sphere, there will pass a normal to $\Sigma$ at one of these points $p$.
If we denote by # the larger of the two regions into which the tangent plane 
at p divides the sphere, we may write 


wis ws2ff fra, 


the integral being taken over a set of points containing all points in a, or #, 
but not in both (i.e. in o+i—o-?#). But if we refer to tangent-normal axes at 
$p$, and denote by $\rho$ the projection of $r$ on the tangent plane, this set is surely
included between $\zeta = -A\rho^{1+\lambda}$ and $\zeta = +A\rho^{1+\lambda}$. Therefore,


4 Aptt® 4 


where $k = \max \sec \gamma$ on $\Sigma$, and where $r'$ is measured from $p$ to the point of
integration. It follows that


k* 871A k®R® 


where R is the maximum chord of V. 

Considerations of an elementary nature establish the fact that the in- 
tegral over ¢ has derivatives of the second order which are bounded at P, and 
that these bounds can be taken independent of a and the position of the di- 
viding plane. 

Returning now to the proof of the theorem for $n = 2$ and $V'$ in $V$, we consider
the difference of the values of the second term in the above equality at
two interior points, $P_1$ and $P_2$. If we denote by $\sigma$ the portion of $V$ in a sphere
of radius $2r_{12}$ about $P_1$, this difference may be written as $\Delta_1 + \Delta_2$, where


a= fff 




Then, if $A'$ is the coefficient in the Hölder condition on $\kappa$,


|Ai| s 2a’ fff ray. 


From this follows the inequality 


|A:| < 


for the first integral is not greater than the second.* 
Passing to $\Delta_2$, we have


$$\Delta_2 = \Delta_{21} - \Delta_{22} + \Delta_{23},$$


where $\kappa_1$ and $\kappa_2$ are the values of $\kappa$ at $P_1$ and $P_2$. The coefficients of $(\kappa_1 - \kappa_2)$ in
$\Delta_{21}$ and $\Delta_{22}$ are bounded independently of $P_1$ and $r_{12}$, the first by Theorem XI
and the second by the lemma just proved. Therefore,


| Aes | Keria, 
| Ase | s 


There remains $\Delta_{23}$, which may be written in the form


A fff wa 


where the integration with respect to $s$ is from $P_1$ to $P_2$ along the segment joining
them, and where the integrand is continuous in the field of integration. It
follows that


* See Kellogg, Foundations of Potential Theory, loc. cit., p. 148, Lemma III. 


2ri2 i 
0 
S Kiri, 
4 




8» 1 
| Acs | < isa’ fff — dVds, 
8) 


where 7 is measured from a point of P;P2. Using the inequalities r,<2n, 


r=r:/2, we have 
| < 576A’ f f f f r’—*dVds 
V—e 
< 23044" f f r?’—drds 
8; 


23049A’ 
Ss ‘rie S 
1— 

When the inequalities are assembled it appears that the second term
satisfies a uniform Hölder condition in $V'$ with exponent $\lambda'$. Therefore,
$\partial^2 U/\partial x^2$ satisfies a uniform Hölder condition with exponent $\lambda''$. Similar
reasoning holds for the other derivatives of second order, and the theorem
holds for $n = 2$ when $V'$ is in $V$.

We turn now to the case in which $V'$ is exterior to $V$ save for common
boundary points on $\Sigma$. As we have seen, there is a neighborhood $N$ of any
interior closed portion of $\Sigma$ with the property that through any point of $N$
there passes a normal to $\Sigma$ at an interior point of $\Sigma$. This neighborhood can
be so chosen as to include all points of $\Sigma$ in $V'$. We shall confine ourselves to
the points common to $N$ and $V'$, for since $U$ is harmonic at all other points of
$V'$, its derivatives satisfy Hölder conditions uniformly in this remaining portion
of $V'$.

Let $P_1$ be a point in $V'$ (because of what has been said we shall from now
on omit the specification that it be in $N$) on the normal to $\Sigma$ at $p_1$. Let $P_2$
be a second point of $V'$ such that the parallel through $P_2$ to $P_1p_1$ meets $\Sigma$ in
the single point $p_2$, a situation always attainable by a uniform restriction on
$P_1P_2$. We refer $V$ to tangent-normal axes at $p_1$ and write


Ox? |e Ox? 


J | - of f feos (v, |, 


= Ai + Ag, 


=— 
where 
Ai 
Ao 




$\kappa_1$ and $\kappa_2$ being the density at $p_1$ and $p_2$ respectively.
Since $\kappa$ is in $C^{\lambda'}$, and since $\cos(\nu, x)$ is in $C^{\lambda}$ on $\Sigma$, we have, by Theorem
VIII and by the reasoning of Lemma 1,


$$|\Delta_1| \le K_1 r_{12}^{\lambda''},$$


$K_1$ being an appropriate constant. Here $r_{12}$ is the distance $P_1P_2$; and since the
integrals are computed at $P_1$ and $P_2$ whereas $\kappa_1$ and $\kappa_2$ are the values of $\kappa$
at $p_1$ and $p_2$, it is important to notice in the establishment of the above inequality
that $p_1p_2$ does not exceed the projection of $P_1P_2$ times $k$ ($= \max \sec \gamma$),
and hence


S 


In considering $\Delta_2$, we note that the points $P_1$ and $P_2$ may be thought of as
not on $\Sigma$, for if a function, continuous in a closed region, satisfies a uniform
Hölder condition in the interior, it satisfies the same uniform Hölder condition
in the closed region.

Using now a sphere of radius $2r_{12}$ about $p_1$ and calling the portion of $V$ in
this sphere $\sigma$, we may write


$$\Delta_2 = \Delta_{21} + \Delta_{22},$$


where in $\Delta_{21}$ the field of integration is $\sigma$, and in $\Delta_{22}$, $V - \sigma$. Then


| Aes | s2ff 
+2 


$Q$ being the point of integration. If we let $r_1 = p_1Q$, and $r_2 = p_2Q$, we have


where $A'$ is the coefficient in the Hölder condition on $\kappa$. From this follows the
inequality


| 


since the first integral is not greater than the second. 
Passing to $\Delta_{22}$, we have


| 

| 

1674’? f 

S Korie, 




SSS 


= Aso: + Anos. 


We may write 


0 1 
f f cos (», 2)— dS, 
Ox s’ T 


where $S'$ is the boundary of $V - \sigma$, and $r$ is measured from $P_1$. Since $\kappa$ satisfies
a uniform Hölder condition with exponent $\lambda'$, $\Delta_{221}$ will satisfy an inequality
of the desired form if the derivative of the simple spread is bounded.
That this is so may be seen by writing it in the form


“ff. cos dS = f feo (v, dS — cos (», 


(12) 1 


where $\Sigma_1$ is the portion of $\Sigma$ contained in $\sigma$, and $\Sigma_2$ is the rest of the boundary
of $\sigma$.

The first term on the right, being a derivative in a fixed direction of the
potential of a simple spread with density of class $C^{\lambda}$, is continuous, and therefore
bounded in $V'$. The same reasoning cannot be applied to the remaining
two terms because of the presence of edges at distances from $p_1$ which approach
0 with $r_{12}$. But they are, nevertheless, bounded, uniformly as to $P_1$
and $p_1$. This may be seen as follows.

In the second term $|\cos(\nu, x)| < A r_1^{\lambda}$, where $r_1$ is the distance from $p_1$ to
the integration point, because $\Sigma$, and hence $\Sigma_1$, are of class $C^{1+\lambda}$, and since
$\cos(\nu, x)$ vanishes at $p_1$ on account of the position of the axes. Hence


1 1 
—ff cos (vy, x)—dS| < ff | cos (v, x) | — sec y dS’ 
Ox 2; r r? 

ki*A f f 




where $r'$ is the projection of $r$ and $r_1$ on the $(x, y)$-plane, where $k = \max \sec \gamma$
on $\Sigma_1$ and the integration is over the projection of $\Sigma_1$. The last integral is uniformly
convergent and vanishes with $r_{12}$.

In the third term, we have $2r_{12} \le kr$, where $2r_{12}$ is the radius of $\Sigma_2$ and
$k = \max \sec \gamma$. Hence, the term is not greater in absolute value than


1 1 
[fteseff 
r? 22 (2r12)? 


| Ace: | Kn. 


Thus 


Finally, we have to consider $\Delta_{222}$, which may be written in the form


where the integration with respect to $s$ is from $P_1$ to $P_2$ along the segment
joining them, and where the integrand is continuous in the field of integration.

Let $a$ denote a constant such that in the sphere of radius $a$ about $p_1$ the
angle between any two normals never exceeds $\pi/6$; $a$ can be selected independently
of $p_1$, because of the uniform continuity of the direction cosines of
the normal to $\Sigma$. If we restrict $r_{12}$ to be less than $a/4$, we may confine ourselves
to the portion $v$ of $V$ between the sphere of radius $a$ about $p_1$, and the sphere
$\sigma$, of radius $2r_{12}$; for, in the remaining field, the integrand in $\Delta_{222}$ is uniformly
bounded, and the corresponding integral, accordingly, does not exceed
in absolute value a constant times $r_{12}$. For the rest,


82 1 1 
(k — ke) — dVds 18A’ ro’ — dVds 
4 
184’ f f f f dVds, 
v r 


where 7 is measured from a point of the segment PP», and rz is measured from 
p2. For any fixed position of the point of integration, 72/7 is greatest when r 
is measured as nearly as possible perpendicular to the normal to = at fy, in 
which case 


IIA 


IIA 


2 
re S (r + PiP2) max secy S rd + riz). 


Moreover, 




2rie 
Fae = (31/2 1)ric. 
max sec y 


Accordingly, 


33 1 
f ff (k — ke) —dVds|=s 1assa’ ff 
8s 2a 
= 583204’ f ro’—dreds 
a; 2rio 


58329A’ 


1 


s 


When the inequalities are assembled, it appears that $\partial^2 U/\partial x^2$ satisfies a
uniform Hölder condition in $V'$ with exponent $\lambda''$. The reasoning requires no
modification in the case of other derivatives of the second order, provided
one direction is tangential. In the case of $\partial^2 U/\partial z^2$, with the above orientation
of the axes, the only point at which modification is necessary is in the proof
that the second term in the expression (12) is bounded. For, $\cos(\nu, z) = 1$
at $p_1$. In this case, however, the term becomes, when we use the projection
$\Sigma_1'$ of $\Sigma_1$ on the $(x, y)$-plane as the field of integration,


0 1 
Oz zy zy’ 


This can be shown to be bounded, independently of $P_1$, $p_1$ and $r_{12}$, by comparing
it with the same derivative of a spread of unit density on the flat
region


and noting that $\rho/r$ lies uniformly between two positive bounds.

When the details are supplied, the proof of Theorem XII for $n = 2$ is
complete.

For $n > 2$, we use the identity (11) and assume that the theorem has been
established for derivatives of order lower than $n$. Then the first term in (11)


Hence, . 
2 Ti2 
— —(1+—} < 3. 
31/2 r 
zy 




has continuous derivatives of order $n-1$ in $V'$ which satisfy there a uniform
Hölder condition with exponent $\lambda''$, by Theorem VIII. The same is true of
the second term, by our assumption about the derivatives of order lower than
$n$. Therefore, the theorem is true for derivatives of order $n$. Since it has been
established for $n = 2$, it follows that it holds for any $n \ge 2$.

12. Logarithmic potentials. The potential of a logarithmic distribution on 
a plane curve can be interpreted as the potential of a distribution on an in- 
finite cylinder with elements perpendicular to the plane of the curve. Fur- 
thermore, the potential of the distribution on the portion S of the cylinder, 
outside two planes parallel to and on either side of the plane of the curve, is 
continuous and has continuous derivatives of all orders at all points of the 
plane of the curve.* The situation is the same for logarithmic double distri- 
butions and for logarithmic distributions on plane areas. 

Knowing this, we can see immediately that all the theorems established 
in this paper for surface or volume distributions hold also for logarithmic 
distributions on plane curves or areas, without alteration other than the 
appropriate changes in dimensionality. 
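
The cylinder interpretation can be made concrete with one elementary integral. The sketch below is an illustration under simplifying assumptions that are not in the text: a single straight line of unit density perpendicular to the plane of the curve, truncated at heights $\pm L$, with $\rho$ denoting the distance in the plane from the foot of the line.

```latex
\[
\int_{-L}^{L} \frac{dz}{\sqrt{\rho^{2}+z^{2}}}
 \;=\; 2\log\frac{L+\sqrt{L^{2}+\rho^{2}}}{\rho}
 \;=\; 2\log\frac{1}{\rho} \;+\; 2\log\!\left(L+\sqrt{L^{2}+\rho^{2}}\right).
\]
```

For fixed $L > 0$ the last term is analytic in $\rho$ near $\rho = 0$, so the Newtonian potential of the finite piece differs from the logarithmic kernel $2\log(1/\rho)$ only by a smooth function. This is the mechanism by which regularity results for surface and volume potentials carry over to logarithmic ones.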

* Kellogg, Foundations of Potential Theory, loc. cit., p. 174. 


RADCLIFFE COLLEGE, 
CAMBRIDGE, Mass. 


THE DEGREE OF CONVERGENCE OF A SERIES OF 
BESSEL FUNCTIONS* 


BY 
M. G. SCHERBERG 


A number of problems of mathematical physics require the expansion of 
an arbitrary function in terms of Bessel functions in a manner analogous to 
the expansion of such a function in trigonometric functions. An important 
series in Bessel functions, necessary to the solution of one group of problems, 
the most familiar of which are the problems of the vibrating circular drumhead
and the flow of heat in a cylinder, is the Bessel series, which has the
form†


$$f(x) = \sum_{n=1}^{\infty} B_n J_0(\lambda_n x),$$


in which the $B_n$ are constants and the $\lambda_n$'s are the positive roots of the equation

$$l\lambda J_0'(\lambda) + hJ_0(\lambda) = 0,$$


where either $l = 0$ or $h/l > 0$.
The more general series to be studied in this paper has the form 


$$f(x) = B(x) + \sum_{n=1}^{\infty} B_n J_\nu(\lambda_n x) \tag{1}$$


in which the $\lambda$'s are the positive roots of the equation‡


$$l\lambda J_\nu'(\lambda) + hJ_\nu(\lambda) = 0 \tag{2}$$


while $B(x)$ is an additional term which is present when (2) has also a pair of
imaginary roots $\pm i\lambda_0$. Since the functions $x^{1/2}J_\nu(\lambda_n x)$ form an orthogonal set§
over a range of integration from zero to one, the coefficients $B_n$ are found in
the usual formal manner and are


$$B_n = \frac{\int_0^1 x f(x) J_\nu(\lambda_n x)\,dx}{\int_0^1 x J_\nu^2(\lambda_n x)\,dx}. \tag{3}$$


The function $B(x) = 0$ when $l = 0$ or $h/l + \nu > 0$. Otherwise it has a form depending
on whether $h/l + \nu$ is equal to or less than zero.‖


* Presented to the Society, September 11, 1931; received by the editors April 25, 1932, and, 
in revised form, June 20, 1932. 

† Watson, Theory of Bessel Functions, 1922, pp. 596-597; Byerly, Fourier's Series, pp. 12-14.

‡ The notation $J_\nu'(x)$ for $(d/dx)J_\nu(x)$ will be used throughout the paper.

§ Gray and Mathews, Treatise on Bessel Functions, 1922, p. 91. 

|| Watson, loc. cit., pp. 596-597. 




That the series converges to the value of the function $f(x)$ under suitable
restrictions on the function, and the range of the variable, has been shown by
several writers.* The degree of convergence of a series is the order of magnitude
of the difference between the function and the first $n$ terms of the series.
Thus, if those restrictions are placed upon the function $f(x)$ which insure the
convergence of the series to the proper value in a defined range of $x$, the degree
of convergence of the series may be calculated as the order of magnitude
of the remainder after $n$ terms.
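
To make the notion concrete, the following worked bound shows how an estimate on the general term turns into a degree of convergence. It assumes, purely for illustration, that the $m$th term is at most $K m^{-3/2}$ in absolute value, which is the order of magnitude that arises later in connection with Theorem I; the comparison with an integral is a standard device and is not claimed to be the author's own argument.

```latex
\[
\Bigl|\sum_{m=n+1}^{\infty} u_m\Bigr|
 \;\le\; K \sum_{m=n+1}^{\infty} m^{-3/2}
 \;\le\; K \int_{n}^{\infty} t^{-3/2}\,dt
 \;=\; \frac{2K}{n^{1/2}} .
\]
```

Under that assumption the remainder after $n$ terms is of order $n^{-1/2}$, and this order is what is meant by the degree of convergence.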

To avoid undue repetition, a convention of symbol is made at this time.
$K$ will designate constants independent of $x$ and $n$ and depending only on
such fixed quantities as $\nu$, the number of discontinuities in $f'(x)$, etc. The
function $\theta$ will be any function of any number of variables which is numerically
less than one for all values of the variables considered. The notation
$\theta_1(x)$ will indicate a function which has one for an upper bound and which
has a bounded derivative with respect to $x$.


I. THE DEGREE OF CONVERGENCE IN THE ABSENCE OF HIGHER 
DERIVATIVES 
LEMMA 1. If $F(x)/x$ has bounded variation in the interval $0 \le x \le 1$, then
K0(An) 
By means of the asymptotic formula†

$$J_\nu(x) = \left(\frac{2}{\pi x}\right)^{1/2} \left\{ \cos(x - \alpha) + \frac{K\theta(x)}{x} \right\} \tag{5}$$


on setting $\Phi(x) = F(x)/x$, we have


f = ( 


2 1/2 1 K 
) f @(x)x!/? cos (Anx — a)dx + —- 
0 


Since $\Phi(x) = \Phi_1(x) - \Phi_2(x)$, in which $\Phi_1$, $\Phi_2$ are monotone increasing, we may
assume without loss in generality that $\Phi$ is also monotone increasing and
hence, by the second law of the mean,

* C. N. Moore, these Transactions, vol. 12 (1911), pp. 181-200; also Watson, loc. cit., pp. 576- 
605. 


† Lipschitz, Crelle's Journal, vol. 56 (1859), pp. 193-196; Watson, loc. cit.; C. N. Moore, loc. cit.,
p. 189.


v20, 




1 
f @(x)x!/? cos (Anx — a)dx 
0 


g 1 
= 0) f cos (Anx# — a)dx + — f cos (Anx — a)dx 
0 


KO(rn) 
LEMMA 2. In the interval $0 < x < 1$ let the function $f(x)$ be absolutely continuous
and let $f'(x)$ have bounded variation. Then the general coefficient $B_n$ of
the series (1) may be written as


6; = 


{" when 1 ¥ 0, 
+ 3 when 1 = 0. 


The treatment in this part of the paper is similar to that employed by
C. N. Moore* in a paper on the uniform convergence of a Bessel series.
The denominator of $B_n$ may be written†


f (dax)dx = (Xn) + (dn) } — 


and with the aid of (5) is readily reduced to the form 


0 


By means of (6), B, now assumes the form 
1 
(7) [ha + 
0 
On integration by parts with the aid of the recurrence formula

$$\frac{d}{dx}\left[x^{\nu+1}J_{\nu+1}(x)\right] = x^{\nu+1}J_\nu(x) \tag{8}$$

the integral in (7) becomes


* C. N. Moore, loc. cit., p. 183. 
† Byerly, An Elementary Treatise on Fourier Series, 1902, p. 224, formula 12.


where 




f a (x)J,(Anx)dx 
0 


_1 ¢'_F@ 

Ando (Ana)?! 
1 1 

= (x) + Xn vf (x) r41(Anx) dx 


_ 
An 


+ = [f(%) — dx. 


d[ | 


vf (0 1 


If we assume, as we may without loss in generality, that $f'(x)$ is $\ge 0$ and
monotone increasing, then $(1/x)[f(x) - f(0)]$ will be also positive and monotone
increasing.

It follows that the last two integrals in the last line of (9) have the form

The integral $\int_0^1 J_{\nu+1}(\lambda_n x)\,dx$ of (9) may be written


since the integrals in question all converge. Although it is not necessary to
go beyond the fact that the integral $\int_0^\infty J_{\nu+1}(x)\,dx$ is a function of $\nu$ alone, its
value one* will be utilized. By means of (5)


= f (=) ‘sin (x —a)dx+K f = KO(s) 


An 43/2 


Thus 


Ke, 
(10) f dx = + 


To complete the proof of Lemma 2, it remains to reduce the term
$f(1)J_{\nu+1}(\lambda_n)/\lambda_n$ of (9). A formula for the roots of equation (2) due to Moore†
gives


(11) An = or +q+—= Ky(n)-n, 15 y(n) S 2, 
nN 


* Gray and Mathews, loc. cit., p. 65, Formula 8. 
† C. N. Moore, loc. cit., pp. 189-196.


(9) 
+ — — — af" (x) J r41(Anx)dx 
An 0 An 0 
| 




2 1 


v+1 


and & is an integer, positive, negative or zero. 
From (11) 
K6(n) 
sin (A, — @) = sin (mm +q—a)+ 
n 


K 
nN 


| Ké@(n) 


nN 


and (5) 


2 1/2 2 1/2 Ké@ Ké 
12 vei(An) = 1)” — — 
(12) June) = + = 


in which $\delta_l$ is defined as in the statement of Lemma 2. The conclusion of the
lemma follows from a combination of the above results.


THEOREM I. Let $f(x)$ be a function such as described in Lemma 2, and, in
addition, let the conditions $\delta_l f(1) = \nu f(0) = 0$ be satisfied. Then


Ko0(n, x) 
f(x) — = 0326 1, 
n 


/2 


where S,(x)— B(x) is the sum of the first n regular terms of the series (1). 


It has been shown* that under the conditions of the Lemma 1, the series 
will converge to the value of the function in any sub-interval of 0<x <1 hav- 
ing zero as an end point provided f(x) is continuous in this sub-interval and 
the product vf(0) =0, and that it will converge to the value of the function 
in a sub-interval of 0<x<1 having one as an end point if again f(x) is con- 


*C. N. Moore, loc. cit., has shown the convergence to f(x) under conditions which insure 
“closure” and hence the convergence to f(x) under conditions of the lemma follows if there is con- 
vergence at all. 


where 
1 = 0, 
1 #0, 
4 
1 = 0, 
140, 
| 




tinuous in this sub-interval and the product 6,f/(1) =0. It, therefore, follows 
that under the conditions of Theorem I the series will converge to f(x) 
throughout the interval 0<*<1. Further, from Lemma 2 the general term 
of the series assumes the form 

K0(An) 


Since $J_\nu(\lambda_n x)$ is uniformly bounded,* this general term may be written
$K\theta(n)/n^{3/2}$ and the remainder after $n$ terms becomes


THEOREM II. Let f(x) be a function such as described in Lemma 1. Then 
in the interval OS asxSb<1 
K@(x,n)  Kérf(1)0(n, x) KO(n,x) vf(0)K6(x, n) 
(1 — x)n'/? 3/2442 3/24 3/2 


f(x) — S.(x) = 


where unless vf(0) =0, b¥1 unless 5,f(1) =0, and 
K0(n, x) x) 


43/242 


= 0 whena = 0. 


Under the conditions of Lemma 2, the series (1) becomes†


yo 


m=n+1 m=n+l1 m 


(13) + > 


m 


m=n+l 


By means of (5) the general term of the last sum becomes 
COS — KO(Am) 
raf 
which with the aid of (11) yields 
K6(\m Ka(n, 
K6QAm) (n, x) (n, x) 
ngil? 


m=n+l1 


n2x3/2 


The first sum of (13) is zero when y>0 and x=0 since J,(0) =0. On the 
other hand if y=0, then J,(0)=1, and this sum is that of an alternating 


* Watson, loc. cit., p. 44. 
+ The convergence of the separated parts will be apparent in what follows. 


0s 238 1. 


178 M. G. SCHERBERG [January 


series of numerically decreasing terms. It has a value, then, which is nu- 
merically less than the first term or K@(m)/(n+1)"?. When x>0 the sum- 
mation is made in three parts. 

Let r and s be the smallest integers greater than (w—1) which satisfy the 
inequalities >: and \,41% >ke where and are the first and second 
positive roots of J/ (x) =0. Now when « is small r and s will surely be larger 
than (n+1) and this sum may be divided into three parts in the first two of 
which J,(A,x) is monotone. With the arguments of the >°’s omitted, they are 


By means of (5) 
“ 2 1/2 (— 1)” cos (Amz — a) (— 
a4) (5) | | 


m=s+1 Xm m=s+1 Ane 


= 04 + 


The sum og; converges absolutely, hence 


2 A? ? 


But 


in which by means of (11) 
— = Ay) (Z), << E < 
= K6(k)x. 


m=s+1,8+3,++* 


_ K&(s+1) Ko(s+1) Ko(n+1) 


> ke. 


The sum o, remains to be treated. By means of (11) one readily finds 


cos — a) = cos + q)x — a] + 


|| 
Thus 




2 (— 1)™ cos — 


TX m=s+1 Am 


2\" > (— 1)™cos [(mr + 9)x- a] K6(s + 1) 


mr (s + 1)!/2 


since 
XAm > ke, m>s. 


The sum on the right is a linear combination of the real and imaginary 
parts of 


2 1/2 x 1 2 1/2 C-) 1 
m=s+1 m=s+1 MT 


If 1) =¢, 


Silo) = = emtid = g(stl)rie — = ( 
m=s+1 1— 


then, by the classical transformation of Abel, 
emrie 1 


1 
= —— §, —(S, — S, 
41+ 42 41) + 


( 1 1 )+s ( 1 1 )+ K6(, s) 


Therefore 


> (— 1)™e™** K6@(s + 1, x) 


mas+1 mr (1 — x)(s+ 


As has already been pointed out J,(A,«) is monotone in the sums o; and 
g2, and hence they are readily reduced to sums of terms which alternate in 
sign and decrease numerically. They will each have the form 


Ké(n) 


and 
1 1 K0@(m) 
hw mr m? 

Hence 

= | 
|| 




The second sum of (13) may be reduced by the methods employed on the 
first sum. It is found that 
(— 1)"J,(Amx) K0(n, x) K0(n, x) 


3/243/2 


The sum (13) is now readily reduced to the form of Theorem II. 


II. THE DEGREE OF CONVERGENCE WHEN HIGHER DERIVATIVES OF f(x) EXIST 


Just as in the case of other well known expansions, the convergence is 
more rapid when higher derivatives are present provided other conditions 
are suitably adjusted. The Bessel series requires, in general, rather strong re- 
strictions at the end points of the interval in which the function is repre- 
sented. 

In view of the great similarity of the procedure in Part II to that of Part 
I, the results in the former will be merely stated and the detailed proofs left 
to the reader. 


LEMMA 3. In the interval $0 \leq x \leq 1$, let f(x) and its first $(p-2)$ derivatives
be continuous and let $f^{(p-1)}(x)$ be absolutely continuous while $f^{(p)}(x)$ has bounded
variation. Then the coefficient $B_n$ of $J_\nu(\lambda_n x)$ in the series (1) may be written as


m=p—1 s=p—l (- 5, 


B, = + | 


s=0 


1 ¢ (— 1)*f(x)k(p, s, 


where d*f(x) 
and dx* 
k(p, 5, v) — 1 — cos? 2)! 
= (« —2+ cost e 


x (p- - syst) 


us 
q + 1+ cos? — 
2 2 


X 


n 
ry 




in which v(v+1) --- (v+q—s—1) and 1-3-5 --- are to 
be replaced by 1 when g=s, q=1 respectively; and the notation q=s, 1 
implies that g=s when s>0 and g=1 when s=0. 


The proof of Lemma 3 is readily obtained after simple but lengthy cal- 
culations by integrations by parts with the aid of recurrence formula (8). 


LEMMA 4. Let f(x) be a function such as described in Lemma 3. Suppose,
further, that f(x) together with its first $(p-2)$ derivatives vanishes at the end points
x=0 and x=1. Then the coefficient $B_n$ of the series (1) may be written


+ 


n n n 


where 5, is defined as in Lemma 2, c is an undetermined integer and R(p, v) 
= x5 '(—1)*k(p, s, v) (R(p, s, v) as in Lemma 3). 


THEOREM III. Let f(x) be a function such as described in Lemma 4, and, in 
addition, let the conditions 


8? — cos* (rp/2)fP-Y(1) = R(p, »)f-(0) = 0 


be satisfied. Then 
K0(n, x) 
— S,(x) = 


nP-1/2 


THEOREM IV. Let f(x) be a function such as described in Lemma 4. Then in 
the interval OSasxsbs1 


wp 
6? — cos? f?-(1)0(n, x) Koln, x) 


K0(x, n) ( 
+ (1 om /2y pth 
R(p, x) 
+ 


f(x) — Sa(x) = 


in which a¥0 unless R(p, v)f?- (0) =0, unless 
5? — cos? (rp/2) = 0 


and 
K0(n,x)  K0(n, x) 
pth 


=0 whena=0. 


= 
K 




III. ON THE MAGNITUDE OF THE CONSTANTS 


To make more definite the results of the previous sections, the magnitude 
of the constants which occur in some of the formulas there used have been 
computed. Due to the fact that the calculations are rather lengthy and can- 
not be easily summarized for presentation, these results will be given in an 
informal manner. It was found for x20 and a=(2y+1)7/4, 


J,(x) = (=) [eos (x — a) (2x)/2? > 5/2, 


cos (= — x > 0, 


2\1/2 21/26, (x) 
) | c0s a) + | 0s vs 3/2, x>0, 
x 


TX 
2\1/2 200,(x) 

- (=) [cos a) + | 0s+25/2,x>0, 
3x 


that the positive roots of J,(x) larger than 


~) when » > 5/2 


and larger than 
80 T 
when 0572 5/2 
3x 2 


are given by the formulas 
(4/3) 
a 
200 
3a, + 3x(m + k) 


An = ai + (n+ + vy > 5/2, n 22, 


0<»<5/2, 2, 


An = + (n+ + 


2 
where & is an integer not less than (—3) for the first equation, and not less 
than 0 for the second; that the positive roots of AJ? (A) +/J,(A) =0 larger 
than 4K/r—7/2 are given by the formula 


2Ke 2v+1 
1#0,a= 
a+ x(n + k) 4 


where & is an integer not less than (—3) and 


An = at(n+k)rt+ 


\ 3a 




h/l 2.2:/2 


or & is a positive integer or zero, and 


K = + | r+ 


UNIVERSITY OF MINNESOTA, 
MINNEAPOLIS, MINN. 




ON THE PROPERTIES OF POLYNOMIALS SATISFYING 
A LINEAR DIFFERENTIAL EQUATION: PART I* 


BY 
I. M. SHEFFER 


Introduction. Sequences of polynomials such as the Legendre, the La- 
guerre and the Hermite polynomials appeared in mathematics many years 
ago, and their properties have been investigated by numerous people. They 
satisfy simple difference equations, and also are solutions of linear differential 
equations of second order. The expansion problems (in the complex plane) 
associated with them are not so old. For the Legendre polynomials the region 
of convergence was determined by C. Neumann.† More recently the convergence
regions for the Laguerre and Hermite polynomials were treated by
O. Volk.‡ The paper of Volk considers, more generally, the boundary value
problem (in the complex domain) for a second-order linear differential equation,
not restricting attention to polynomials. The nth-order equation has
since been treated, as a boundary value problem, by L. Bristow.§

Up to the present, however, there has been no general study of the proper- 
ties of polynomials satisfying a linear differential equation of order higher 
than two. The present paper has in view such an investigation. There is 
another aspect to our treatment. In an earlier work we considered the proper- 
ties of arbitrary sets of polynomials,|| associating with each set a linear dif- 
ferential equation, usually of infinite order. We obtained certain formal 
properties, whose complete justification required convergence proofs. The 
present paper deals with these matters for the case of a finite order equation. 

§1 is preliminary: we state two theorems of Perron, and prove a corollary 
that is of use later. §2 introduces a fundamental differential equation whose 
polynomial solutions {y,(x)} we investigate, as well as the entire function 


* Presented to the Society, December 27, 1929, under the title The polynomial solutions of 
linear differential equations; Expansions; received by the editors May 10, 1932. 

† Über die Entwicklung einer Funktion mit imaginärem Argument nach den Kugelfunktionen 1.
und 2. Art, Halle, 1862.

‡ Über die Entwicklung von Funktionen einer komplexen Veränderlichen nach Funktionen, die
einer linearen Differentialgleichung zweiter Ordnung mit einem Parameter genügen, Mathematische
Annalen, vol. 86 (1922), pp. 296-316.

§ Expansion theory associated with linear differential equations and their regular singular points, 
these Transactions, vol. 33 (1931), pp. 455-474. 

|| On sets of polynomials and associated linear functional operators and equations, American 
Journal of Mathematics, vol. 53 (1931), pp. 15-38. We shall refer to this paper throughout as Sets. 




POLYNOMIAL SOLUTIONS OF DIFFERENTIAL EQUATIONS 185 


solutions $\{D_n(t)\}$ of the dual equation. §3 deals with general theorems on
sets; §§ 4 and 5 give inequalities and resulting theorems of expansion; and
in §6 we obtain certain biorthogonality relations and differential equations
for functions allied to $\{y_n(x)\}$ and $\{D_n(t)\}$.

The important problem of expansions in the polynomials $\{y_n(x)\}$ has
hardly been touched. It demands considerations of another order from those
of the present paper. Accordingly, we postpone its treatment to Part II.

1. Preliminary: The Perron theorems. We have need of the following 
two theorems (A and B) due to Perron:* 


THEOREM A. Consider the rth-order difference equation
(i) $a_{i0}x_i + a_{i1}x_{i+1} + \cdots + a_{i,r-1}x_{i+r-1} + x_{i+r} = 0$ $(i = 0, 1, \cdots)$.
Let $\lim_{i\to\infty} a_{ij}$ exist, $= a_j$, $j = 0, 1, \cdots, r-1$, and let $q_1, \cdots, q_k$ be the distinct
absolute values of the roots of the characteristic equation
$a_0 + a_1 z + \cdots + a_{r-1}z^{r-1} + z^r = 0$.


Let $e_m$ = the number of zeros of absolute value $q_m$ ($e_1 + \cdots + e_k = r$). Then if
$a_{i0} \neq 0$ for all i, there is a fundamental set of r solutions divided into k classes,
such that the mth class contains $e_m$ of these, and these $e_m$ solutions satisfy the
condition $\limsup |x_n|^{1/n} = q_m$.
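
As a hedged illustration of Theorem A (a sketch under our own assumptions, not part of Perron's statement), take the constant-coefficient case r = 2 with $a_0 = 2$, $a_1 = -3$, so that the characteristic equation $2 - 3z + z^2 = 0$ has roots of absolute value $q_1 = 1$ and $q_2 = 2$. The helper names `solve` and `growth_rate` are hypothetical; exact integers are used so that large terms do not overflow.

```python
from math import exp, log

def solve(a0, a1, x0, x1, n):
    """Iterate a0*x_i + a1*x_{i+1} + x_{i+2} = 0 and return [x_0, ..., x_n]."""
    xs = [x0, x1]
    for _ in range(n - 1):
        xs.append(-a1 * xs[-1] - a0 * xs[-2])
    return xs

def growth_rate(xs):
    """Estimate |x_n|**(1/n) at the last index n of the list."""
    n = len(xs) - 1
    return exp(log(abs(xs[-1])) / n)

a0, a1 = 2, -3                       # characteristic equation 2 - 3z + z^2 = 0
slow = solve(a0, a1, 1, 1, 400)      # x_n = 1, the class belonging to q = 1
fast = solve(a0, a1, 1, 2, 400)      # x_n = 2^n, the class belonging to q = 2
mixed = solve(a0, a1, 1, 5, 400)     # x_n = 4*2^n - 3, dominated by the root 2
print(growth_rate(slow), growth_rate(fast), growth_rate(mixed))
```

The first two starting values pick out one solution from each class; a generic starting value, as in `mixed`, grows at the larger rate. No choice of starting values other than zero gives a rate below 1, which is the content of Lemma 1 in this simple case.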


THEOREM B. In the system of equations in infinitely many unknowns
(ii) + Din) Xitn = 
n=0 
let the following conditions hold: 
dotbo~O (i =0,1,---); limsup| <1; 
| bin| S kO",O<O0<1; lim k; = 0; 


F(z) = is analytic in| z| < 1. 
0 


If in $|z| < 1$, $F(z)$ has n zeros (multiple roots counted multiply), then the general
solution of (ii) satisfying the condition $\limsup |x_n|^{1/n} < 1$ contains n arbitrary
constants.

It is not apparent from the statement of Theorem A that a solution
$\{x_n\}$ (not $\equiv 0$) cannot be formed for which $\limsup |x_n|^{1/n} < \min(q_1, \cdots, q_k)$.
As we need this fact, we shall establish


* Über Summengleichungen und Poincarésche Differenzengleichungen, Mathematische Annalen,
vol. 84 (1921), pp. 1-15.


186 I. M. SHEFFER [January 


LEMMA 1.* Under the hypotheses of Theorem A there is no solution not identically
zero for which $\limsup |x_n|^{1/n} < \min(q_1, \cdots, q_k)$.


Regarding $x = (x_0, x_1, \cdots)$ as a vector and $L[x]$ as the vector operator
that carries x into the vector y with ith component
$y_i = a_{i0}x_i + a_{i1}x_{i+1} + \cdots + a_{i,r-1}x_{i+r-1} + x_{i+r}$,
let us determine an operator M that is inverse to L: $ML[x] = x$. Let the ith
component of $M[x]$ be


Then we are to have, identically in the {x;}, 


This gives the equations 
= 1, 
(a) Ms + * + Me = O (i 1, r); 


Since $a_{i0} \neq 0$ for all i, the quantities $m_{ij}$ exist and are unique. M is then determined.
It remains to consider convergence. Let s be fixed, and set


(b) Nk = Ms Dk ri = Us+k,i- 
Then we have the equations 
(c) + Mina H+ = 0 (i =0,1,---) 


for mo, m%, +--+. It is easily verified that the conditions of Theorem A hold 
for (c), so that for every solution {,} of (c) we have lim sup|m,|!/*< 
max (|4:|,---, |-|), where 4, - - - , ¢, are the zeros of 1+a,.t+ - - - 
=0. Now if min (q:, - - - , gx) =0 the lemma is vacuously true. We may then 
assume that min (qi, - - - , gx) =A>0, in which case Then - - -, ¢, 
are the reciprocals of the roots of the characteristic equation of (i), so that 
lim sup |”, To €>0 we have for all s. 

Now suppose a solution {x,} of (i) exists such that lim sup |x, |" =5<X. 
Then to e’>0 we have |x,|<C(e’)(6+e’)". Let y=max(|aio|,---, 
for all 7. Then 


* A statement, without proof, of this lemma is given in Nörlund, Differenzenrechnung, 1924,
p. 309.


bial 
i=s 


| {L[x]};| | Gx Lite | 

where the definition of H is obvious. Therefore 


| {a(z[x]]}.| < 


t=0 


On choosing $\epsilon$, $\epsilon'$ small enough the infinite geometric series converges. Hence,
when we substitute $L[x]$ into M, forming $ML[x]$, and in the sth component
combine coefficients of the same $x_i$'s, we obtain an absolutely convergent
series; the process is then legitimate. But $L[x] = (0, 0, \cdots)$, and since
$ML[x] = x$, it follows that $x = (0, 0, \cdots)$. This proves the lemma.

2. Solutions of a differential equation and its dual. Our principal aim is
the study of the polynomial solutions of the kth-order linear differential
equation


(1) $L[y(x)] \equiv L_0(x)y(x) + L_1(x)y'(x) + \cdots + L_k(x)y^{(k)}(x) = \lambda y(x)$,
where
(2) $L_i(x) = l_{i0} + l_{i1}x + \cdots + l_{ii}x^i$ $(i = 0, 1, \cdots, k)$
is a polynomial of degree not exceeding i and $\lambda$ is a parameter, and of its dual
equation (soon to be defined). We define $\lambda_n$ by
(3) $\lambda_n = l_{00} + nl_{11} + n(n-1)l_{22} + \cdots + n(n-1)\cdots(n-k+1)l_{kk}$.

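THEOREM 1. If* $\lambda_m \neq \lambda_n$, $m \neq n$, and if $l_{kk} \neq 0$, the equation
(4) $L[y(x)] = \lambda y(x)$
has an entire function solution ($\not\equiv 0$) if and only if $\lambda$ has one of the values
$\lambda = \lambda_0, \lambda_1, \cdots$; and when $\lambda = \lambda_n$, there is just one entire function solution, namely
a polynomial $y_n(x)$ of degree exactly n.

To demonstrate this, substitute into (4) the power series $y(x) = \sum_{n=0}^{\infty} y_n x^n$.
On equating coefficients we find the following equations for the $y_i$:
(5) $(\lambda_n - \lambda)y_n + \sigma_{n,n+1}y_{n+1} + \sigma_{n,n+2}y_{n+2} + \cdots + \sigma_{n,n+k}y_{n+k} = 0$

$(n = 0, 1, \cdots)$,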


* If $l_{kk} = 0$ or $\lambda_m = \lambda_n$ for some $m \neq n$, it is necessary to modify some of our later arguments, and
we leave such considerations out of the present paper.




where
(6) $\sigma_{n,n+1} = l_{10}(n+1) + l_{21}(n+1)n + \cdots + l_{k,k-1}(n+1)n\cdots(n-k+2)$,
$\sigma_{n,n+2} = l_{20}(n+2)(n+1) + l_{31}(n+2)(n+1)n + \cdots + l_{k,k-2}(n+2)(n+1)\cdots(n-k+3)$,
$\cdots$
$\sigma_{n,n+k} = l_{k0}(n+k)(n+k-1)\cdots(n+1)$.


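For definiteness let us suppose that* $l_{k0} \neq 0$. We then obtain the difference
equation


(5′) $y_{n+k} + \frac{\sigma_{n,n+k-1}}{\sigma_{n,n+k}}y_{n+k-1} + \cdots + \frac{\sigma_{n,n+1}}{\sigma_{n,n+k}}y_{n+1} + \frac{\lambda_n - \lambda}{\sigma_{n,n+k}}y_n = 0$ $(n = 0, 1, \cdots)$,


with

(7) $\lim_{n\to\infty} \frac{\sigma_{n,n+i}}{\sigma_{n,n+k}} = \frac{l_{k,k-i}}{l_{k0}}$ $(i = 1, \cdots, k-1)$, $\lim_{n\to\infty} \frac{\lambda_n - \lambda}{\sigma_{n,n+k}} = \frac{l_{kk}}{l_{k0}}$.

The characteristic equation of (5′) is
(8) $l_{kk} + l_{k,k-1}z + \cdots + l_{k0}z^k = 0$ $(l_{kk} \neq 0)$.


Let $a$ (> 0) be the least absolute value of the roots of (8). Then, by Lemma 1,
for every solution $\{y_n\}$ we have $\limsup |y_n|^{1/n} \geq a$ provided $\lambda \neq \lambda_0, \lambda_1, \cdots$.
Hence in this case $y(x) = \sum_{n=0}^{\infty} y_n x^n$, with radius of convergence $\leq 1/a$, cannot
be an entire function (unless $y(x) \equiv 0$).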

Now let $\lambda = \lambda_n$. A solution of equations (5) is seen to be $y_{n+1} = y_{n+2} = \cdots = 0$,
$y_n$ arbitrary (but $\neq 0$), and $y_{n-1}, y_{n-2}, \cdots, y_0$ determined uniquely and
successively (in view of $\lambda_m \neq \lambda_n$) from the $(n-1)$st, $(n-2)$d, $\cdots$, 0th equations
of (5). Hence one entire function solution of (4) (for $\lambda = \lambda_n$) is the polynomial


(9) $y_n(x) = y_{n0} + y_{n1}x + \cdots + y_{nn}x^n$, $y_{nn} \neq 0$.


To show that there is no other entire function solution, ignore the first
n+1 equations of (5′). The system of equations remaining has the limits (7)
for its coefficients. Moreover, now the coefficient of that $y_i$ of lowest index in
each equation is different from zero, so that Lemma 1 again applies, and the
only entire function solution (for the modified system of equations) is the
function zero. That is, the only entire function solution of (4) is one whose
coefficients beyond $x^n$ are all zero. And, since a polynomial solution of degree
$\leq n$ is unique, as we have seen, the theorem is established.
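
The back substitution used in this proof can be sketched numerically. The following is a minimal illustration under our own assumptions, not a computation from the paper: the coefficients are stored as `l[i][j]` for $l_{ij}$, the sample data are those of the Legendre equation $(1 - x^2)y'' - 2xy' = \lambda y$ (for which $l_{kk} \neq 0$ and the $\lambda_n = -n(n+1)$ are distinct), and the function names are hypothetical.

```python
from fractions import Fraction

def falling(m, i):
    """Falling factorial m(m-1)...(m-i+1); equals 1 when i = 0."""
    out = 1
    for r in range(i):
        out *= m - r
    return out

def char_number(l, n):
    """lambda_n as in (3): sum over i of n(n-1)...(n-i+1) * l[i][i]."""
    return sum(falling(n, i) * l[i][i] for i in range(len(l)))

def sigma(l, s, m):
    """Coefficient of y_m in the s-th equation of (5), for s < m <= s + k."""
    k = len(l) - 1
    return sum(l[i][i - (m - s)] * falling(m, i) for i in range(m - s, k + 1))

def poly_solution(l, n):
    """Coefficients [y_n0, ..., y_nn] of y_n(x), normalised so that y_nn = 1."""
    k = len(l) - 1
    lam_n = char_number(l, n)
    y = [Fraction(0)] * (n + 1)
    y[n] = Fraction(1)
    for s in range(n - 1, -1, -1):          # the (n-1)st, ..., 0th equations
        tail = sum((sigma(l, s, m) * y[m]
                    for m in range(s + 1, min(n, s + k) + 1)), Fraction(0))
        y[s] = -tail / (char_number(l, s) - lam_n)
    return y

# Assumed sample data: L_0 = 0, L_1 = -2x, L_2 = 1 - x^2 (Legendre type).
legendre = [[0], [0, -2], [1, 0, -1]]
print([char_number(legendre, n) for n in range(4)])
print(poly_solution(legendre, 3))
```

For this sample the characteristic numbers come out as 0, -2, -6, -12, and $y_3(x) = x^3 - \tfrac{3}{5}x$, a constant multiple of the third Legendre polynomial. The division by $\lambda_s - \lambda_n$ is where the hypothesis $\lambda_m \neq \lambda_n$ enters.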


* If $l_{k0} = 0$, (5) becomes a difference equation of order < k, but the same conclusion will follow.




DEFINITION. The numbers $\{\lambda_n\}$ may be defined as the characteristic numbers
of equation (4).


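Let us in (1) replace each $y^{(i)}$ by $t^i$, thus obtaining the function
(10) $L(t; x) = L_0(x) + L_1(x)t + \cdots + L_k(x)t^k$;
or,
(10′) $L(t; x) = (l_{00} + l_{10}t + \cdots + l_{k0}t^k) + (l_{11}t + l_{21}t^2 + \cdots + l_{k1}t^k)x + \cdots + l_{kk}t^kx^k = \mathcal{L}_0(t) + \mathcal{L}_1(t)x + \cdots + \mathcal{L}_k(t)x^k$.


Now in (10′) replace each power $x^i$ by the derivative $D^{(i)}(t)$ of a function* $D(t)$.
This gives us a differential expression which we term the dual of (1):

(11) $\mathcal{L}[D(t)] \equiv \mathcal{L}_0(t)D(t) + \mathcal{L}_1(t)D'(t) + \cdots + \mathcal{L}_k(t)D^{(k)}(t)$.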


Evidently, (1) is the dual of (11). Similarly, we term equations (4) and


(12) $\mathcal{L}[D(t)] = \lambda D(t)$
dual equations.


THEOREM 2. If $\lambda_m \neq \lambda_n$, $m \neq n$, and $l_{kk} \neq 0$, equation (12) has a formal power
series solution about the origin if and only† if $\lambda = \lambda_0, \lambda_1, \cdots$; and when $\lambda = \lambda_n$
there is precisely one such power series, $D_n(t)$ ($\not\equiv 0$), and it is an entire function
with an nth order zero at the origin.


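To prove this substitute the power series $D(t) = \sum_{n=0}^{\infty} d_n t^n$ into (12) and
equate coefficients. This yields the equations
(13) $(\lambda_n - \lambda)d_n + \sigma_{n,n-1}d_{n-1} + \sigma_{n,n-2}d_{n-2} + \cdots + \sigma_{n,n-k}d_{n-k} = 0$ $(n = 0, 1, \cdots)$,
where
(14) $\sigma_{n,n-1} = l_{10} + l_{21}(n-1) + l_{32}(n-1)(n-2) + \cdots + l_{k,k-1}(n-1)(n-2)\cdots(n-k+1)$,
$\sigma_{n,n-2} = l_{20} + l_{31}(n-2) + \cdots + l_{k,k-2}(n-2)\cdots(n-k+1)$,
$\cdots$
$\sigma_{n,n-k} = l_{k0}$.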


If $\lambda \neq \lambda_0, \lambda_1, \cdots$ then successively we get $d_0 = d_1 = \cdots = 0$, so that the
only formal power series solution is $D(t) \equiv 0$. Now suppose $\lambda = \lambda_n$. Then
$d_0 = d_1 = \cdots = d_{n-1} = 0$, $d_n$ is arbitrary, and $d_{n+1}, d_{n+2}, \cdots$ are successively


* See Sets, loc. cit., pp. 31-32. 
† Consequently, we may say that equations (4) and (12) have the same characteristic numbers.




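and uniquely determined (in terms of $d_n$). Hence there is just one formal solution,
$\sum_{i=n}^{\infty} d_i t^i$. We proceed to show this solution is an entire function. The
first n+1 equations of (13) drop out, giving us


$(\lambda_s - \lambda_n)d_s + \sigma_{s,s-1}d_{s-1} + \cdots + \sigma_{s,s-k}d_{s-k} = 0$ $(s = n+1, n+2, \cdots)$;
and, on setting $r = s - k$, we get the difference equation


(13′) $d_{r+k} + \frac{\sigma_{r+k,r+k-1}}{\lambda_{r+k} - \lambda_n}d_{r+k-1} + \cdots + \frac{\sigma_{r+k,r}}{\lambda_{r+k} - \lambda_n}d_r = 0$,
with

(15) $\lim_{r\to\infty} \frac{\sigma_{r+k,r+k-i}}{\lambda_{r+k} - \lambda_n} = 0$ $(i = 1, \cdots, k)$,


and with the characteristic equation
(16) $z^k = 0$.


By* the Perron Theorem A, for every solution of (13′) we have $\limsup |d_n|^{1/n} = 0$;
and this implies that $D(t)$ is an entire function.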
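
The forward recursion (13) can be sketched in the same way. This is a minimal illustration under our own assumptions: coefficients are stored as `l[i][j]` for $l_{ij}$, the sample data are again of Legendre type, the normalisation $d_n = 1$ is chosen for convenience, and the function names are hypothetical.

```python
from fractions import Fraction

def falling(m, j):
    """Falling factorial m(m-1)...(m-j+1); equals 1 when j = 0."""
    out = 1
    for r in range(j):
        out *= m - r
    return out

def char_number(l, n):
    """lambda_n as in (3)."""
    return sum(falling(n, i) * l[i][i] for i in range(len(l)))

def sigma_dual(l, s, m):
    """Coefficient of d_m in the s-th equation of (13), for s - k <= m < s."""
    k = len(l) - 1
    shift = s - m
    return sum(l[j + shift][j] * falling(m, j) for j in range(k - shift + 1))

def dual_series(l, n, terms):
    """Coefficients d_n, d_{n+1}, ..., d_{n+terms-1} of D_n(t), with d_n = 1."""
    k = len(l) - 1
    lam_n = char_number(l, n)
    d = {n: Fraction(1)}
    for s in range(n + 1, n + terms):
        tail = sum((sigma_dual(l, s, m) * d[m]
                    for m in range(max(n, s - k), s)), Fraction(0))
        d[s] = -tail / (char_number(l, s) - lam_n)
    return [d[s] for s in range(n, n + terms)]

# Assumed sample data: L_0 = 0, L_1 = -2x, L_2 = 1 - x^2 (Legendre type).
legendre = [[0], [0, -2], [1, 0, -1]]
print(dual_series(legendre, 0, 5))
print(dual_series(legendre, 1, 3))
```

For these data the dual equation (12) with $\lambda = \lambda_n = -n(n+1)$ reads $t^2D'' + 2tD' - (t^2 + n(n+1))D = 0$, and the n = 0 output reproduces the Taylor coefficients 1, 0, 1/6, 0, 1/120 of $\sinh t / t$. The rapid decay of the coefficients is consistent with $\limsup |d_n|^{1/n} = 0$.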

We can say even more: 

COROLLARY. The solution $D_n(t) = \sum_{i=n}^{\infty} d_{ni}t^i$ corresponding to $\lambda = \lambda_n$ satisfies
the inequality


$\limsup_{i\to\infty} |D_n^{(i)}(0)|^{1/i} \leq \rho$ = maximum absolute value of the zeros of $L_k(x)$,


so that the function A,(t) defined by 


(17) A,(t) = 


i=n 


has a radius of convergence at least equal to 1/p. 
To show this, let d,;=v,/i!; then (13’) leads to the difference equation 


An 
(13”) 
An 


with 


* If $l_{k0} = 0$, (13′) is a difference equation of order less than k, but the same conclusion follows.




r=0 Api — An kk 


(¢=1,---, k), 


and to the characteristic equation $L_k(t) = 0$, whose largest zero is in absolute
value $\rho$. Now the coefficient of $v_r$ in (13″) is never zero.* Hence, from
Theorem A, $\limsup |v_n|^{1/n} \leq \rho$; and from this the corollary follows.

D,(#) is the so-called Borel entire function associated with A,(¢) and the 
two are related by the following integral: 


1 ef 1 
(18) D,(t) = — An (-) du, 


where C is a closed contour surrounding the origin and lying wholly outside 
of |u| =p. 
Similarly, if we set 


(19) Y,(x) = Ol yno + 1! +--+ + 2! 


then y,(x) is the Borel entire function for Y (x), and we have 


1 1 
(20) ya) (—)aw, 


I being a closed contour surrounding the origin. 


DEFINITION. An analytic function $f(t; x)$ is self-dual with respect to the
above operators $L$, $\mathcal{L}$, operating respectively on the variables x and t, if


$L[f(t; x)] = \mathcal{L}[f(t; x)]$.
COROLLARY. $e^{tx}$ is self-dual, and
(21) $L[e^{tx}] = \mathcal{L}[e^{tx}] = e^{tx}L(t; x)$.


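3. Associated sets of functions; the sets $P_\lambda$, $Q_\lambda$. Let us now consider the
following parametric differential expressions corresponding to (1) and (11):


(22) $L_\lambda[y(x)] = L[y(x)] - \lambda y(x)$,
(23) $\mathcal{L}_\lambda[D(t)] = \mathcal{L}[D(t)] - \lambda D(t)$.


Define the set of polynomials $P_\lambda$: $\{P_n(x; \lambda)\}$ by


(24) $P_n(x; \lambda) = L_\lambda[x^n] = (L_0(x) - \lambda)x^n + nL_1(x)x^{n-1} + \cdots + n(n-1)\cdots(n-k+1)L_k(x)x^{n-k}$ $(n = 0, 1, \cdots)$,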

* That is, if $l_{k0} \neq 0$. Should $l_{k0} = 0$ (here and hereafter), the remark of a previous footnote applies.




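and the set of functions $\mathcal{P}_\lambda$: $\{\mathcal{P}_n(t; \lambda)\}$ by

(25) $\mathcal{P}_n(t; \lambda) = \mathcal{L}_\lambda[t^n] = (\mathcal{L}_0(t) - \lambda)t^n + n\mathcal{L}_1(t)t^{n-1} + \cdots + n(n-1)\cdots(n-k+1)\mathcal{L}_k(t)t^{n-k}$ $(n = 0, 1, \cdots)$.

We see that $P_n(x; \lambda)$ is a polynomial in x, of degree n if $\lambda \neq \lambda_0, \lambda_1, \cdots$, and
that $\mathcal{P}_n(t; \lambda)$ is a polynomial in t, of degree not exceeding n+k, with a zero
at the origin of order at least n.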

DEFINITION. Let $H(t; x)$ be a symbol for a formal power series in t with
coefficients that are formal power series in x, so that when it is expressed formally
as a power series in x, the coefficients are (formal) power series in t:


$H(t; x) \sim \sum_{n=0}^{\infty} h_n(x)t^n/n! \sim \sum_{n=0}^{\infty} \mathfrak{h}_n(t)x^n/n!$ ($h_n(x)$, $\mathfrak{h}_n(t)$ power series).


Then we say that the two sets of functions $\{h_n(x)\}$, $\{\mathfrak{h}_n(t)\}$ $(n = 0, 1, \cdots)$ are
associated sets.


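LEMMA 2. The two sets $P_\lambda$, $\mathcal{P}_\lambda$ given in (24, 25) are associated sets, and, for*
all x and t,


(26) $\sum_{n=0}^{\infty} P_n(x; \lambda)t^n/n! = \sum_{n=0}^{\infty} \mathcal{P}_n(t; \lambda)x^n/n! = e^{tx}L_\lambda(t; x) =$ (definition) $P(t, x; \lambda)$,
where
(27) $L_\lambda(t; x) = L(t; x) - \lambda$.
(26) follows from (24), (25) and (21). The convergence of (26) is immediate.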
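If we expand (24), we obtain


(28) $P_n(x; \lambda) = (\lambda_n - \lambda)x^n + \sigma_{n-1,n}x^{n-1} + \sigma_{n-2,n}x^{n-2} + \cdots + \sigma_{n-k,n}x^{n-k}$,
where the $\sigma_{ij}$ are given by (6).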
In our theory of sets of polynomials† we considered the multiplication of
sets. Thus, if P: $\{P_n(x)\}$, Q: $\{Q_n(x)\}$ are any two sets, where
$P_n(x) = p_{n0} + p_{n1}x + \cdots + p_{nn}x^n$, $Q_n(x) = q_{n0} + q_{n1}x + \cdots + q_{nn}x^n$,
then PQ is the set $\{PQ_n(x)\}$, where
$PQ_n(x) = p_{n0}Q_0(x) + p_{n1}Q_1(x) + \cdots + p_{nn}Q_n(x)$ $(n = 0, 1, \cdots)$.


In particular, a set Q is the inverse of P if PQ = I where I is the identity set:
$I_n(x) = x^n$. It is easy to see that an inverse Q exists if and only if $P_n(x)$ is of

* (26) is formally true in the general theory of sets of polynomials.
† Sets, p. 16.




degree exactly n for every n; and in this case Q is unique, and P is also the
inverse of Q: QP = I.
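
Since $PQ_n(x) = \sum_j p_{nj}Q_j(x)$, multiplication of sets amounts to multiplying lower triangular coefficient matrices, and the inverse exists exactly when every diagonal entry $p_{nn}$ is non-zero. The following is a minimal sketch under that reading; the truncation to four polynomials, the sample set $P_n(x) = (x+1)^n$, and the function names are our own illustrative choices.

```python
from fractions import Fraction
from math import comb

def multiply(P, Q):
    """Coefficient rows of the product set PQ, where PQ_n(x) = sum_j p_nj Q_j(x)."""
    size = len(P)
    return [[sum((P[n][j] * Q[j][c] for j in range(c, n + 1)), Fraction(0))
             for c in range(size)] for n in range(size)]

def inverse(P):
    """Rows of the set Q with PQ = I; requires every P_n to have degree exactly n."""
    size = len(P)
    Q = []
    for n in range(size):
        if P[n][n] == 0:
            raise ValueError("P_n must have degree exactly n")
        row = [Fraction(int(c == n)) for c in range(size)]   # coefficients of x^n
        for j in range(n):                                   # subtract p_nj * Q_j
            for c in range(size):
                row[c] -= P[n][j] * Q[j][c]
        Q.append([v / P[n][n] for v in row])
    return Q

# Illustrative set: P_n(x) = (x + 1)^n for n = 0..3, stored as coefficient rows.
P = [[Fraction(comb(n, j)) for j in range(4)] for n in range(4)]
Q = inverse(P)
identity = [[Fraction(int(r == c)) for c in range(4)] for r in range(4)]
print(Q[3])
```

Here the inverse set is $Q_n(x) = (x-1)^n$, and both products PQ and QP reduce to the identity set, in line with the remark that P is also the inverse of Q.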

Now $P_n(x; \lambda)$ is of degree exactly n ($\lambda \neq \lambda_0, \lambda_1, \cdots$). Hence the set $P_\lambda$
possesses an inverse set $Q_\lambda$: $\{Q_n(x; \lambda)\}$. Since $P_\lambda Q_\lambda = I$, there follows from (28)


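LEMMA 3. The set $\{Q_n(x; \lambda)\}$ satisfies the difference equation


(29) $(\lambda_n - \lambda)Q_n(x; \lambda) + \sigma_{n-1,n}Q_{n-1}(x; \lambda) + \cdots + \sigma_{n-k,n}Q_{n-k}(x; \lambda) = x^n$ $(n = 0, 1, \cdots)$.
Examination of the first few $Q_n(x; \lambda)$'s suggests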


LEMMA 4. $Q_n(x; \lambda)$ is a rational function of $\lambda$, having at most* simple poles
at $\lambda = \lambda_0, \lambda_1, \cdots, \lambda_n$.


This is true for n = 0, 1 as is readily seen. The Lemma then follows by
induction from (29).
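From (24) we find that


(30) $L_\lambda[Q_n(x; \lambda)] = QP_n(x; \lambda) = x^n$.
If we multiply through by $t^n/n!$ and sum formally from n = 0 to $\infty$, we obtain
(31) $L_\lambda[Q(t, x; \lambda)] = e^{tx}$,


where


(32) $Q(t, x; \lambda) = \sum_{n=0}^{\infty} Q_n(x; \lambda)t^n/n!$.


We proceed to show that† (32) is uniformly convergent in t and x in some
region, thus making (31) valid.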

On dividing (29) through by on_:x,n, we obtain a system of equations for 
Qo, Qi, - - - satisfying all the hypotheses of Perron Theorem B, the function 
F(z) here being Li(z)/Ixo, provided |x| <1. If Li(z) does not have all its 
k zeros in |z|<1, we cannot apply Theorem B to all the solutions of the 
system in question. This difficulty can be overcome by modifying the system 
as follows: 

Let p again denote the largest absolute value of the zeros of L,(z) and 
set Q,(x; \) =p*R,(x; X). We then obtain for {R,} the system of equations 


* Individual Q,(x; \)’s may fail to have a pole at some of these points; e.g., Q:(x; A) will not 
have Xo as a pole if io=0. It is however easy to prove that Qa(x; A) always has A, as a pole (i.e., if 
x is not given special values). 

t Q(t, x; \) is a function of \ as well; we must therefore avoid such values of \ as will make 
Q(t, x; A) singular. 




(a) R, - (s =0,1,-+-). 
Os Os Os ,s+kP 
This system satisfies the conditions of Theorem B, if we restrict x to lie in 
|x |<p, and the function F(z) is here (1/I:o)Lx(zp). Now the characteristic 
roots all lie in |z|<1 so that by Theorem B, the general solution of (a) for 
which lim sup |R, |!/*<1 contains & arbitrary constants, exactly as many as 
enter into the general solution of (a). Hence, every solution { R,} of (a) satis- 
fies the condition lim sup |R, |!/*<1. We thus have 


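COROLLARY 1. For $|x| < \rho$,


(33) $\limsup |Q_n(x; \lambda)|^{1/n} \leq \rho$ $(\lambda \neq \lambda_0, \lambda_1, \cdots)$.


We have, moreover, uniformly* in $|x| < \rho$,
(33′) $|Q_n(x; \lambda)| \leq C(\rho + \epsilon)^n$.
Here $\epsilon > 0$ is arbitrary, and C does not depend on x.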


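If we had defined $R_n(x; \lambda)$ by $Q_n(x; \lambda) = \delta^n R_n(x; \lambda)$, the argument
made above would continue to hold, giving us


COROLLARY 2. For $|x| < \delta$, $\delta$ ($\geq \rho$) arbitrary,
(34) $\limsup |Q_n(x; \lambda)|^{1/n} \leq \delta$ $(\lambda \neq \lambda_0, \lambda_1, \cdots)$;


and

(34′) $|Q_n(x; \lambda)| \leq C_1(\delta + \epsilon)^n$

uniformly in $|x| < \delta$. Here $\epsilon > 0$ is arbitrary, and $C_1$ depends only on $\epsilon$ and $\delta$.
Since $\delta$ may be chosen arbitrarily large we have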


THEOREM 3. The function $Q(t, x; \lambda)$ given by (32) is analytic† in t, x, $\lambda$; it
is an entire function in t and x, and its singularities in $\lambda$ are at most at the points
$\lambda = \lambda_0, \lambda_1, \cdots$. Moreover (31) holds for every t, x, $\lambda$ ($\lambda \neq \lambda_0, \lambda_1, \cdots$).


Let us return to the theory of sets.‡ We have there given the


DEFINITION. A triangular set of functions $\{\mathcal{P}_n(t)\}$, $n = 0, 1, \cdots$, is a set
of formal power series in t such that $\mathcal{P}_n(t)$ begins with a power of t not less than n:


$\mathcal{P}_n(t) \sim \pi_{nn}t^n + \pi_{n,n+1}t^{n+1} + \cdots$.


* The uniformity can be established from the Perron proofs.

† Relation (34′) is also uniform in every bounded $\lambda$-region, the points $\lambda = \lambda_0, \lambda_1, \cdots$ being deleted.

‡ Sets, p. 32.




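For such sets multiplication is defined as follows:
$\mathcal{P}\mathcal{Q} = \{\mathcal{P}\mathcal{Q}_n(t)\}$: $\mathcal{P}\mathcal{Q}_n(t) \sim \pi_{nn}\mathcal{Q}_n(t) + \pi_{n,n+1}\mathcal{Q}_{n+1}(t) + \cdots$ $(n = 0, 1, \cdots)$.
If, in particular, $\mathcal{P}\mathcal{Q} = \mathcal{I}$ where $\mathcal{I}$: $\{\mathcal{I}_n(t) = t^n\}$ is the identity set, then $\mathcal{Q}$ is the
inverse of $\mathcal{P}$. Such $\mathcal{Q}$ exists if and only if $\pi_{nn} \neq 0$, $n = 0, 1, \cdots$, and then $\mathcal{Q}$
is unique, and $\mathcal{P}$ is the inverse of $\mathcal{Q}$.

Let P: $\{P_n(x)\}$ be a polynomial set, and $\mathcal{P}$: $\{\mathcal{P}_n(t)\}$ the associated set.
($\mathcal{P}$ is then a triangular set.) On setting

$P_n(x) = p_{n0} + p_{n1}x + \cdots + p_{nn}x^n$, $\mathcal{P}_n(t) \sim \pi_{nn}t^n + \pi_{n,n+1}t^{n+1} + \cdots$,
it is seen that the property of being associated sets is equivalent to the relations

(35) $\pi_{nn} = p_{nn}$, $\pi_{n,n+i} = p_{n+i,n}/((n+i)(n+i-1)\cdots(n+1))$ $(i = 1, 2, \cdots)$.


We can establish the following general theorem on sets: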


THEOREM 4. (a) Let P, $\mathcal{P}$ be associated sets, and Q, $\mathcal{Q}$ their respective inverses.
Then Q, $\mathcal{Q}$ are associated sets.

(b) If P, Q are inverse sets, and $\mathcal{P}$, $\mathcal{Q}$ their associated sets, then $\mathcal{P}$, $\mathcal{Q}$ are
inverse sets.


Consider (a). Let 
Qn(x) = + Gund", Qn(t) ~ + 
From PQ = and PQ=35 we obtain the relations 
PnoQo(x) + PanQn(x) = tan Qn(t) + i Qnilt) 


1,i=0 
0,i=1 
1,i=0 
0,i=1 


(a) PanQn.n—i + Pn.n—19n—1,n-i + + Pn n—iQn—i,n—i = 


(B) TanKkn + Tn nt+1Kn+1,n+i Tr nt+iKn+in+i = { 


The theorem will be proved if we establish that 


Knn = 
Kn nti = Qn+i.n/((m + i) (n + 1)) 
or, that 7=0, 1, where 


(i) 


San = 
Santi = Qn+i,n/((m + i) (n + 1)). 


In equations (a), substitute for gn,n4: its value in terms of 5n4:,n. Moreover, 
in the resulting equations, leave the first equation unchanged, replace by 


(ii) 


i 
q 




n-+1 in the second, m by m+2 in the third, and so on. This gives the following 
equations (a’), equivalent to equations (a): 

1,i1=0, 
0,%=1,2,---. 


(a’) nti + ,n+i-1 + + Tn ntiSnn = 


_ We verify at once that San=Knn, SUPPOSE = for 
j7=0,1,---+,%-—1. We shall then prove it for 7 =i, and the theorem will be 
demonstrated. Denote respectively by Eno, Em, Fno, Pm, the left 
hand members of (a’) and (8) for i=0, 1, - - - . Now form linear expressions 
in the s’s and x’s, respectively: 

The coefficient in E; of iS readily seen to be 
and this is also the coefficient of kn4:—p,n4i—p+q in F;. Hence from E;=F;, and 
our induction assumption, it follows that tan = 
But Tan=Pnn~0, n=0, 1, - - - (since P possesses an inverse). Therefore, 
Sn,n+i = Kn,n+i and the induction is complete. 

To establish (b), let $\mathcal{Q}^*$ be the inverse of $\mathcal{P}$. Then by (a), Q and $\mathcal{Q}^*$ are
associated sets. But the associate of a set is unique, so that $\mathcal{Q} = \mathcal{Q}^*$.

Consider the associate sets P,, P, of (26). The coefficient of # in P,(¢; 
) is not zero for A¥Xo, Au, - - - - Hence P, possesses an inverse Q): { Q,(¢; 
d)}, and by Theorem 4 we have the 


COROLLARY. Q) and Q) are associated sets, and 
(36) Olt, >) = = 
n=0 n=0 


the series converging uniformly in every bounded x, t, region (on deleting the 


From the definition (25) of P, we have (using (14)) 
(37) Palt; = (An — engi + +--+ + 


On equating corresponding coefficients in P,Q, =3 there results the following 
difference equation for Q,(¢; 


(58) (n = 0,1,---). 


Now Q, is the inverse of P,, so that we have 


(39) [Q,(¢; d)] = 




(40) Lal Q(t, = d)] = (See (31, 36).) 


On applying the operator C, to (32), we then have e'*=)>0°Q,(x;A)L [é"]/n!, 
or 


(41) = 0) Palt; 0)/nl. 
n=0 

Similarly we obtain 

(42) = 9) Pa( d)/nl. 


The series in (41, 42) converge uniformly for all ¢, x, \ bounded (on deleting 

From the way in which (41) and (42) were established it is clear that they 
hold formally in the general theory of sets. Indeed we can state the more 
general 


THEOREM 5. Let P be any polynomial set, and let $\mathcal{Q}$ be the associate of the
inverse of P. Then


(43) $e^{tx} \sim \sum_{n=0}^{\infty} \mathcal{Q}_n(t)P_n(x)/n!$.
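For, let L be the differential operator* (in general of infinite order) that
carries the identity set I into P: $L[x^n] = P_n(x)$, and let Q be the inverse of P,
so that $L[Q_n(x)] = x^n$. Since Q and $\mathcal{Q}$ are associate,


$Q(t, x) \sim \sum_{n=0}^{\infty} Q_n(x)t^n/n! \sim \sum_{n=0}^{\infty} \mathcal{Q}_n(t)x^n/n!$.


Then,
$e^{tx} \sim L[Q(t, x)] \sim \sum_{n=0}^{\infty} \mathcal{Q}_n(t)L[x^n]/n! \sim \sum_{n=0}^{\infty} \mathcal{Q}_n(t)P_n(x)/n!$,


and this is (43).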
If P is a polynomial set (if $\mathcal{Q}$ is a triangular set) then $\mathcal{Q}$ (P) is uniquely
determined by (43). Hence we have the converse


THEOREM 6. If P, $\mathcal{Q}$ are any polynomial and triangular sets, respectively,
satisfying (43), then $\mathcal{Q}$ (P) is the associate of the inverse of P ($\mathcal{Q}$).


We can now complete Theorem 3. Define 
R(t, x; +) = (A — A). 
* That L exists is established in Sets, p. 29. 




Since Q;(x; X) contains at most a simple pole at X=), R,(é, x; A) is analytic 
at A\=A,. Q(t, x; X) has then, at \=X,, at most a pole of first order. Now 


x; = LrlQ] +0 = + OU, 


therefore C[R,(t, x; x; A), and (on letting 
L[Ra(t, Xn) ] =AnRalt, An). Now An) can be expanded about ¢=0. 
Hence by Theorem 2 it can differ from D,(¢) at most by a factor independent 
of ¢:R,(t, x; kn) =Hn(x)D, (2). Similarly, L[Ra(t, 2; An) ] =AnRalt, x; An), and 
since R,,(¢, x; Xn) is an entire function in x, we must have (Theorem 1) 

(i) R,(t, Xn) = (cx = constant). 


We agree from now on that in y,(x) and D,(t) we shall choose the coeffi- 
cients of x", " respectively to be unity: 


(ii) Yan = 1, den = 1. 
From (29) we see that the coefficient of x" in (A—A,)QO,(x; A) is —1, and since 


R,(t, x; = { lim (A — An)On(x; 
i=n 
the coefficient of in R,(t, x; An) is —1/n!. Hence, by (i, ii), = —1/m!. 
Now let +, be a closed contour in the A-plane surrounding the point \ =A, 
but containing in its interior and on its boundary no other \;. Then 


1 
Q(t, x; = lim (A — A, )O(E, A) = Ralt, x; An)- 
J 4, 


THEOREM 7. At each of the points $\lambda = \lambda_n$, $Q(t, x; \lambda)$ has a simple pole with
residue $-y_n(x)D_n(t)/n!$, so that


1 


From (44) we have 
— yn(x)D,(4)/n! = lim (A — x; d). 


On using (32), then, 
— yn(x)D,(t)/n! = lim (A — An)On(x; 
Ah, 
+ lim — (mf AE 


so that on equating coefficients of ¢”, 


(b) — yn(x)/n! = lien (A — An)On(x; A)/n!. 


From (b) follows the relation 


Using (29) in (b), we can express y,(x) in terms of the Q,’s: 
(44’) = { An) + + On—k An) } 


This relation can be reversed rather simply as follows: 
Equating like powers of ¢ in (a) gives 


Yn(X)dn (A — An)On+i(%; d)/(n + i)!, 


so that y,(x) is, with a certain constant factor, the residue of Q,,;:(x; d) at 
Hence is, to within a factor, the residue of Q,(x; A) at A=An_i, 
and there exist certain constants gno, » (independent of x and d) 
such that 

yo(x) y1(*) 


The gn;’s can be found successively from recurrence relations. 
The relation (44) suggests the partial fraction expansion


(45) $Q(t, x; \lambda) = \sum_{n=0}^{\infty} \frac{y_n(x)D_n(t)}{n!(\lambda_n - \lambda)}$.
Assuming (45) to converge uniformly, and applying the operator $L_\lambda$, we obtain


(46) $e^{tx} = \sum_{n=0}^{\infty} y_n(x)D_n(t)/n!$.


We shall establish the validity of (45, 46) after we have developed some 
necessary inequalities. Let us observe, however, that (46) is of type (43) so 
that by Theorem 6 (and this holds for the general theory of sets) we have the 


COROLLARY. The sets $\{y_n(x)\}$, $\{D_n(t)\}$ are respectively the associates of
each other's inverse.
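
The expansion $e^{tx} = \sum_n y_n(x)D_n(t)/n!$ suggested by (46) can be checked numerically in a special case. The sketch below is an illustration under our own assumptions, not a proof: it uses Legendre-type data ($L_0 = 0$, $L_1 = -2x$, $L_2 = 1 - x^2$), the normalisation in which the coefficients of $x^n$ and $t^n$ are unity, floating-point arithmetic, and truncated series; all function names are hypothetical.

```python
from math import exp, factorial

L_COEFFS = [[0], [0, -2], [1, 0, -1]]   # assumed sample data l[i][j] (Legendre type)
K = len(L_COEFFS) - 1

def falling(m, i):
    """Falling factorial m(m-1)...(m-i+1); equals 1 when i = 0."""
    out = 1
    for r in range(i):
        out *= m - r
    return out

def lam(n):
    """Characteristic number lambda_n as in (3)."""
    return sum(falling(n, i) * L_COEFFS[i][i] for i in range(K + 1))

def y_value(n, x):
    """Evaluate y_n(x), leading coefficient 1, by back substitution in (5)."""
    c = [0.0] * (n + 1)
    c[n] = 1.0
    for s in range(n - 1, -1, -1):
        tail = 0.0
        for m in range(s + 1, min(n, s + K) + 1):
            sig = sum(L_COEFFS[i][i - (m - s)] * falling(m, i)
                      for i in range(m - s, K + 1))
            tail += sig * c[m]
        c[s] = -tail / (lam(s) - lam(n))
    return sum(c[j] * x ** j for j in range(n + 1))

def d_value(n, t, terms=40):
    """Evaluate a truncation of D_n(t), coefficient of t^n equal to 1, via (13)."""
    d = {n: 1.0}
    for s in range(n + 1, n + terms):
        tail = 0.0
        for m in range(max(n, s - K), s):
            shift = s - m
            sig = sum(L_COEFFS[j + shift][j] * falling(m, j)
                      for j in range(K - shift + 1))
            tail += sig * d[m]
        d[s] = -tail / (lam(s) - lam(n))
    return sum(d[s] * t ** s for s in range(n, n + terms))

def partial_sum(t, x, N):
    """Sum of y_n(x) D_n(t) / n! for n = 0, ..., N."""
    return sum(y_value(n, x) * d_value(n, t) / factorial(n) for n in range(N + 1))

t, x = 0.7, 0.3
print(partial_sum(t, x, 12), exp(t * x))
```

For this sample the partial sums agree with $e^{tx}$ to well within rounding error at the chosen point. For these data the terms are those of the classical expansion of $e^{tx}$ in Legendre polynomials with modified spherical Bessel coefficients.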


4. Some inequalities for $y_n(x)$, $D_n(t)$. Let $\lambda_n$ be defined as in (3).




200 I. M. SHEFFER [January 


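LEMMA 5. For all n and for all i>0,

(47) $|\lambda_{n+i} - \lambda_n| \ge C\left[\dfrac{k}{1!}n^{k-1}i + \dfrac{k(k-1)}{2!}n^{k-2}i^2 + \cdots + \dfrac{k!}{k!}i^k\right]$,

where C is a positive constant independent of both n and i.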

To show this, we can write

(a) $\lambda_{n+i} - \lambda_n = B(n, i) + [A_{k-1,1}n^{k-1}i + A_{k-2,2}n^{k-2}i^2 + \cdots + A_{0k}i^k]$,

where B(n, i) is a polynomial in n and i of degree <k; and a simple calculation
gives for the $A_{rs}$ the values

$A_{k-1,1} = k/1!$, $A_{k-2,2} = k(k-1)/2!$, $\cdots$, $A_{0k} = k!/k!$.

The bracket on the right hand side of (a) agrees, then, with the bracket in
(47). It is now a straightforward argument to show: first, that for n or i (or
both) sufficiently large, the bracket is the dominant term in (a); secondly,
that for i and n both bounded, a C exists. (47) then follows.

With a properly adjusted C we have the


COROLLARY. For all n and all i>0,
(48) $|\lambda_{n+i} - \lambda_n| \ge Ci(n+i)^{k-1}$ (C > 0, independent of n and i).


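The coefficients in the series

$D_n(t) = \sum_{i=n}^{\infty} d_{ni}\,t^i$

are given (see (13)) by*

(49) $(\lambda_s-\lambda_n)d_s + a_{s,s-1}d_{s-1} + \cdots + a_{s,s-k}d_{s-k} = 0$, $s = n+1, n+2, \cdots$,

with $d_n=1$. Let S be a positive number such that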
(b) OSi, fsk. 
Then 
LEMMA 6. For all n and i (i>0) we have, with C as in (48),

(50) $|d_{n+i}| \le \dfrac{1}{i!}\left(\dfrac{Sk}{C}\right)\left(1+\dfrac{Sk}{C}\right)^{i-1}$.


(50) is readily established (by use of (48)) for i=1, 2. Assuming it to be
true up to i−1, we shall prove it true for i by induction, as follows:


* For simplicity we use $d_s$ for $d_{ns}$.


1 
| Anti = An 
Sk 
< 
Ci(n + i) 


| dn+i [| | + + | | ] 


i-2 i-3 


dbi-*-1 
+ + (n+i 


where a =(Sk/C), b=1+(Sk/C); 
i! G n+i (n + i)? 
(n+ i) 


a*hi-*-1 a2pi-—k-1 


| dns | = 


-1 i! i! 


i! 
and this is (50). 
Consequently, 


LEMMA 7. 


i-1 


a ab a 


- inti 4 << pire, 


a, b, p=max (1, a) independent of n. 


We now turn to $y_n(x)$. Its coefficients satisfy (as we see from (5)) the equations


(52) (As An) Vs + ,s+1Vs+1 + Os + + Os = 0 
(s = 0,1,---,#— 1), 
with $y_n=1$.
LEMMA 8. The coefficients of $y_n(x)$ satisfy the inequality


for all n and i (i ≤ n).


The proof, by induction, differs very little from that of Lemma 6, and 
may be omitted. From (53) we get 




LEMMA 9.


n n(n — 1) n! 
K + — + ———_ + -- - +— 
1! 2! n! 


(54) 


n n! 
<K al +---+ K q(x + 5)", 
! n! 


g=max (1, a/b) independent of n, so that for all n and x, 
(55) | ya(x) | S g(|x| + 
Combining Lemmas 7, 9: 
| yn(x)Dn(t)| pge?!*![| + (for all x, 2). 


Hence we have 


THEOREM 8. The series $\sum_{n=0}^{\infty} y_n(x)D_n(t)/n!$ converges uniformly in every
bounded x, t region, and represents an entire function in the two variables x, t.


COROLLARY. The series $\sum_{n=0}^{\infty} y_n(x)D_n(t)/[(\lambda_n-\lambda)\,n!]$ converges uniformly
in every bounded x, t, $\lambda$ region (the points $\lambda=\lambda_0, \lambda_1, \cdots$ being deleted).


The above two series are the right hand members of (45, 46). It remains
to prove that they represent the corresponding left hand members. Let
$H(t, x; \lambda)$ denote the sum of the series in the above corollary. H and Q have
then the same principal parts at $\lambda=\lambda_0, \lambda_1, \cdots$, so that $Q-H$ is an entire
function in all three variables. On applying the operator $L_\lambda$ term-wise (as we
may) to the series H, we get

(a) $L_\lambda[Q(t, x; \lambda) - H(t, x; \lambda)] = e^{tx} - \sum_{n=0}^{\infty} y_n(x)D_n(t)/n!$.

Hence the left hand member is independent of $\lambda$, and represents a function
$C(t, x)$ that is entire in t and x: $C(t, x)=\sum_{m,n=0}^{\infty} c_{mn}x^mt^n$. The right hand
member of (a) has zero as coefficient of $x^mt^n$, $m\ge n$, so that $c_{mn}=0$, $m\ge n$.
Hence

(b) $C(t, x) = \sum_{n=1}^{\infty} c_n(x)t^n$,

where $c_n(x)$ is a polynomial of degree not exceeding n−1.
Now $e^{tx}$ and $\sum_{n=0}^{\infty} y_n(x)D_n(t)/n!$ are self-dual functions; the same is then
true of C(t, x):


(c) L{C(t, «)] = 


0 




On substituting into (c) the series (b), and equating coefficients of like
powers of t, we obtain the equations

(d) $L_0(x)c_n(x) + L_1(x)c_n'(x) + \cdots + L_k(x)c_n^{(k)}(x) = \lambda_nc_n(x) + a_{n,n-1}c_{n-1}(x) + a_{n,n-2}c_{n-2}(x) + \cdots + a_{n,n-k}c_{n-k}(x)$ $(n=1,2,\cdots)$.

It is at once verified that† $c_1(x)=0$. Assume that $c_1(x)=c_2(x)=\cdots=c_{n-1}(x)=0$.
We shall show that $c_n(x)=0$. On our induction assumption, (d)
reduces to
(e) $L[c_n(x)] = \lambda_nc_n(x)$.


$c_n(x)$ is an entire function, so that by Theorem 1, $c_n(x)=a_ny_n(x)$, $a_n$ a constant.
But $c_n(x)$ is of degree less than n; hence $a_n=0$, and $c_n(x)=0$. That is,
C(t, x)=0, and the right hand member of (a) is zero.

We have just established (46). Equation (a) then gives us


(a′) $L_\lambda[Q(t, x; \lambda) - H(t, x; \lambda)] = 0$,


where $Q-H$ is entire in t, x, $\lambda$. By Theorem 1, (a′) has an entire function
solution ($\not\equiv 0$) if and only if $\lambda=\lambda_0, \lambda_1, \cdots$. Hence $Q-H=0$, $\lambda\ne\lambda_0, \lambda_1, \cdots$,
and by continuity $Q-H=0$ for all t, x, $\lambda$. That is, $Q=H$, and this is (45). We
thus have


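THEOREM 9. The two series

$Q(t, x; \lambda) = \sum_{n=0}^{\infty} \dfrac{y_n(x)D_n(t)}{(\lambda_n-\lambda)\,n!}$, $e^{tx} = \sum_{n=0}^{\infty} y_n(x)D_n(t)/n!$

are valid for all t, x, $\lambda$ ($\lambda_0, \lambda_1, \cdots$ deleted).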


COROLLARY. The expansion 


(56) P(t, = — d)¥n(x)D,(t)/n! 


n=0 


is uniformly convergent in every bounded t, x, \ region, thus representing an entire 
function in all three variables. (See (26).) 

5. Further inequalities; expansions in $D_n(t)$, $Y_n(x)$. In the present section
we investigate the question of expansions of functions in terms of the two
sets $\{D_n(t)\}$, $\{Y_n(x)\}$ (introduced in (19)). For this we require some further
inequalities. In the equation


For =constant=c, say. Then so that cli=0. Now if then \1=)o, 
which contradicts our assumption (Am An, mn). Hence 410, and c=0. 




(4′) $L[y_n(x)] = \lambda_ny_n(x)$

make the transformation $x = x^* - \gamma$. On setting $y_n^*(x^*) = y_n(x)$ we have

(4″) $L^*[y_n^*(x^*)] = \lambda_ny_n^*(x^*)$,

where $L^*$ is an operator similar to L, $L_i^*(x^*)$ being equal to $L_i(x)$. In particu-


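If we denote by $\{D_n^*(t)\}$ the set of functions corresponding to $\{y_n^*(x^*)\}$,
then by (46),

$e^{tx^*} = \sum_{n=0}^{\infty} y_n^*(x^*)D_n^*(t)/n!$.

But $y_n^*(x^*) = y_n(x)$, $e^{tx^*} = e^{tx}\cdot e^{\gamma t}$. Hence $D_n(t)$ is transformed into
(57) $D_n^*(t) = e^{\gamma t}D_n(t)$.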

Choose 


Then $l^*_{k,k-1}=0$; i.e., the sum of all the zeros of $L_k^*(x^*)$ is zero. There is clearly
no loss in generality if we go from (4) to (4″) for the choice of $\gamma$ in (58). Then,
dropping asterisks, we shall henceforth suppose that in (4′), $l_{k,k-1}=0$.


LEMMA 10. For all n and s (s>0) we have

(59) $|d_{n,n+s}| \le \dfrac{1}{n}\,\dfrac{h^s}{s!}$, $n \ge 1$,

h being independent of n and s.


To show this, substitute in equations (49) for the d's the numbers $r_i$
defined by

(a) $r_{n+s} = n\,d_{n+s}$, $s = 0, 1, \cdots$, $r_n = n$.

This gives

(b) $(\lambda_{n+s}-\lambda_n)r_{n+s} + a_{n+s,n+s-1}r_{n+s-1} + \cdots + a_{n+s,n+s-k}r_{n+s-k} = 0$ $(s=1,2,\cdots)$.

Solving for $r_{n+s}$, and using the values of the $a_{ij}$'s (given by (14)) as well as
the relation $l_{k,k-1}=0$ and the inequalities (48), we find that


| Tr+s—1 | | Tare | | Tn+3s-3 | 


n+s n+s (n + s)? 


| | | 


™ =n, | rate | 


(c) 




where h=>Sk/C. Choose h =max (2, Sk/C). On setting 


(d) th = 1, = (h/s) [= + tn+s—3 ] 


nts (nts) 
we have 


A simple calculation gives us $t_{n+1}\le h/1!$, $t_{n+2}<h^2/2!$; we shall establish
the relation

(e) $t_{n+s} \le h^s/s!$, $s>0$,

by induction, assuming it true up to s−1. For


(s—1)\(m+s) (s — 2)\(n +s) (s — + s)*-! 


1 [1+ 4 s—2 
stints h h(n + s) h?(n + s)? 


(s—2)---(s—k+1) 


1 


= 


gl 


sints h 


which is (e). Then, $|r_{n+s}|<h^s/s!$, and from this (59) follows. From (59) we get
THEOREM 10. $D_n(t)$ is asymptotically given by


(60) D,(t) = [1 + 
where 
(61) | An(t) | 

THEOREM 11. Let $C(t)=\sum_{n=0}^{\infty} c_nt^n$ have r as its radius of convergence. Then
the series† $C^*(t)=\sum_{n=0}^{\infty} c_nD_n(t)$

(a) converges absolutely at every point in |t| < r;

(b) converges uniformly in |t| ≤ r′ < r, r′ arbitrary;

(c) diverges in |t| > r.

In particular, $D_n(t)$-expansions have circles, center at origin, as their regions
of convergence.


In fact, for m sufficiently large, 
cot"| S| cnDn(t) | =| -|1 + | 2| cnt” | 


† If r=0, the only point of convergence for the $C^*(t)$-series is t=0.




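In $C^*(t)$ let $\limsup |c_n|^{1/n}=\sigma<\infty$ so that the series has a
radius of convergence $r=1/\sigma>0$. We may expand $C^*(t)$ in a power series in
t: $C^*(t)=\sum_{n=0}^{\infty} c_n^*t^n$ with
(i) $c_n^* = c_0d_{0n} + c_1d_{1n} + \cdots + c_nd_{nn}$ $(n = 0, 1, \cdots)$.
Since, by hypothesis, $|c_n| < A(\sigma+\epsilon)^n$, $A=A_\epsilon$, we have (see (50))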
| c*| < A[a(1 + + + + — 1)! 4--- 

+ (o + + (6 + 


a=Sk/C. That is, 


o+e(1! 2!\o + n!\o +e 


a 
<A E + ——. (o + €)" Bio + B independent of n. 


ot+e 
Therefore $\limsup |c_n^*|^{1/n}\le\sigma$, and we have


LEMMA 11. If $C^*(t)=\sum_{n=0}^{\infty} c_nD_n(t)$ has the radius of convergence r, then
$C^*(t)=\sum_{n=0}^{\infty} c_n^*t^n$ has a radius of convergence at least as great as r.


The converse theorem is also true. Its proof, which is not so immediate, 
can be made to depend on inequalities regarding the functions Y,(x) of (19): 


(19) = O!yno + + + 
Set $z_{ni} = i!\,y_{ni}/n!$. Then, by (5),


(As — An) (Ss +R) + (S + + +R) (S + + 
+ =O (s =0,1,--- ); 


(62) 


with z, =1. We find, using (48), that 
| z.-1| | z.-2| n> 1, h = max (2, Sk/C), 
1!” 2!n 

and an induction gives us (compare Lemma 10) 

LEMMA 12. For all n>1 and all i (0 < i ≤ n),
(ii) $|z_{n-i}| \le h^i/(i!\,n)$.

We can write 

Vn(x)/m! = + + +++ + 


1(1/h 1/h\? 1 /kh\* 
= + (=) ou + (=) + —(—) |, 
nt 2!\ x n!i\x 




where $|\theta_{ni}| < 1$ for all n and i. This gives us


THEOREM 12. $Y_n(x)$ has the asymptotic form


(63) $Y_n(x)/n! = x^n[1 + B_n(x)/n]$
where†
(64) | Ba(x)| < 


Let us now consider the expansion (46) for $e^{tx}$. This is (in the variable x)
the Borel entire function associated with $1/(1-tx)$, and $y_n(x)$ is the Borel
function associated with $Y_n(x)$, thus suggesting the expansions


(a) 1/(1 — tx) = = TO, 2), 


n=0 
(65) Y,(1/x)D,(¢) 
x(n!) 

To verify this and determine the region of validity of (a, 65), we appeal
to relations (60) and (63), which show us that T(t, x) is analytic in any region
for which |tx| < 1, and that the series for T(t, x) converges uniformly for
|tx| ≤ a < 1. Let x trace the circle |x| = δ > 0. The series in question then converges
uniformly for |t| ≤ a/δ, and we may multiply it by $e^{x/u}/u$ and integrate
term-wise with respect to u around |u| = δ:


1 f T(t, u) 1 f Y,,(u) 
— = e*/“dur, = 


0 n! 2ri u 


1 Y,,(u) 
— e*/"du = y,(x). 
2riJr 


If we now expand 7 (¢, ~) in a power series about u=0 (as we may): 


T(t, u) = 


we find that $T_n(t)=t^n$, so that (a) is true for |tx| < 1. Therefore (65) holds.
Now by (63), $\limsup |Y_n(1/x)/n!|^{1/n}=1/|x|$; whence from Theorem 11
follows


† For x=0, $|Y_n(x)/n!| = |z_{n0}| \le h^n/(n!\,n)$.




THEOREM 13. For every x the series (65) has the interior of the circle |t| = |x|
as its region† of convergence. In every region |t| < p < |x| the convergence is
uniform.

If $f(t)=\sum_{n=0}^{\infty} f_nt^n$ is analytic in |t| < r, we get from (65), by Cauchy's
integral formula:


2 43 Y,(1 
(66) fo= J alow, 


n=0 x(n!) 


valid for |¢|<r’<r. But r’ can be chosen as close to r as we desire; hence (66) 
is true for all |¢|<r, and is uniformly convergent in every closed region in 
|t| <r. This is the converse of Lemma 11. Combining the two we have 


THEOREM 14. The function f(t) has a convergent $D_n(t)$-expansion if and only
if it is analytic about t=0; and its $D_n(t)$-expansion and its power series expansion
have the same radius of convergence.

A $D_n(t)$-expansion is unique. For if $\sum_{n=0}^{\infty} c_nD_n(t)$ converges, it converges
uniformly (Theorem 11), and we may write $\sum_{n=0}^{\infty} c_nD_n(t)=\sum_{n=0}^{\infty} c_n^*t^n$, where
the $c_n^*$ are given by (i). If the $c_n^*$'s are given, these equations (i) determine
$c_0, c_1, \cdots$ uniquely.

Let us sum up our theorems on $D_n(t)$-series:


THEOREM 15. The series $\sum_{n=0}^{\infty} c_nD_n(t)$ has the single point of convergence t=0
if and only if $\limsup |c_n|^{1/n}=\infty$. If $\limsup |c_n|^{1/n}=\sigma<\infty$, then the series
converges throughout $|t|<1/\sigma$ and diverges throughout $|t|>1/\sigma$. In $|t|<1/\sigma$
the convergence is absolute, and in every closed region in $|t|<1/\sigma$ it is uniform.
If f(t) denotes the sum of the series, then f(t) is analytic in $|t|<1/\sigma$, and $1/\sigma$
is the radius of convergence of $f(t)=\sum f_nt^n$, i.e., the $D_n(t)$-series and the power
series for the same function have the same radius of convergence. Furthermore,
a $D_n(t)$-expansion is unique, and the coefficients are given by


c 


201 x n! 
C being a contour about x=0 and lying in $|x|<1/\sigma$.
We turn now to $Y_n(x)$-expansions. From (60, 63, 46) we readily deduce


Lemma 13. The function 1/(t—x) has the expansion 


t— amd t(n!) 


† Points of the boundary may be points of convergence.


which for a given t has the interior of the circle |x| = |t| as its region of convergence.
In every region |x| < p < |t| the convergence is uniform.

In ways analogous to those used for Theorem 15, we can establish


THEOREM 16. Everything said† of $D_n(t)$-expansions in Theorem 15 holds for
$\{Y_n(x)/n!\}$-expansions, with the modification that the $c_n$'s in the expansion
are now given by


_i¢ 


C being a contour about t=0 and lying in |t| <1/c. 


6. Biorthogonality relations; differential equations for $A_n(t)$, $Y_n(x)$. In
the equation


C being a contour surrounding u=0 and lying outside of |u| = ρ, replace $e^{tu}$
by its expansion $\sum_{s=0}^{\infty} y_s(u)D_s(t)/s!$, which converges uniformly on C. This


gives us 
1 2(u)A,(1 
D,(t) = {— f 
=o \2ride slu 
whence by uniqueness of D,(#)-expansions we have 


THEOREM 17. The functions {yn}, {An} are biorthogonal in the following 
sense: 


(70) 


1,s=n. 


n! 


If we start with the relation 
1 ez 
2QriJr u 


I being a contour around u=0, we obtain the uniformly convergent expan- 
sion 


(a) y(t) = {— f 


slu 


As we have not established uniqueness of y,,(x)-expansions, we cannot at once 
conclude that the brace in (a) is zero or one. But this can be proved in the 
following way. 


† We must of course except the conclusion of Theorem 15 for the case $\limsup_{n\to\infty} |c_n|^{1/n}=\infty$.




Denote by the brace in (a), so that uniformly con- 
vergent for all bounded wu. Multiply both members by A,(1/u)/(m!u) and 
integrate term-wise over the contour C of (70). This gives us, by (70), 

! 
Snr = > i.e., Snr = 
s=() n! 
OF Car =0, FN; Cnn =1. Hence we have 


THEOREM 18. The functions $\{D_n\}$, $\{Y_n\}$ are biorthogonal in the sense


1 D.(u)¥n(1/u) = (O,s #n, 


n! 


(71) 


1,s=n. 


By means of (70) we can show that a $y_n(x)$-expansion is unique if it converges
uniformly in a region that contains the region |x| < ρ+ε, ε > 0 sufficiently
small. Here ρ is, as before, the maximum absolute value of the zeros of $L_k(x)$.
This is equivalent to saying that if the function zero has a $y_n$-expansion that
is uniformly convergent in |x| < ρ+ε, then the coefficients in the expansion
are all zero. This result follows on multiplying the series in question through
by $(1/u)(1/n!)A_n(1/u)$ and integrating over C, using (70).

We can sharpen this conclusion by further considering the functions
$A_n(t)$. We know $D_n(t)$ is the Borel entire function corresponding to $A_n(t)$. In
general,


If the two functions $A(t)=\sum_{n=0}^{\infty} a_nt^n$, $B(t)=\sum_{n=0}^{\infty} n!\,a_nt^n$ are analytic
about t=0, then A(t) is the Borel entire function associated with B(t), and
we write A(t) = BEF{B(t)}.


It is easily established that
LEMMA 14. If A(t) = BEF{B(t)}, then A′(t) = BEF{(B(t)−B(0))/t}.
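
A coefficient-level check of Lemma 14 may help here. The sketch below is only an illustration under the stated definition (A has coefficients $a_n$, B has coefficients $n!\,a_n$); the function names are hypothetical and truncated coefficient lists stand in for the power series.

```python
from fractions import Fraction
from math import factorial


def borel_entire(b):
    """Coefficients of A = BEF{B}: a_n = b_n / n! (b holds the b_n in order)."""
    return [Fraction(c, factorial(n)) for n, c in enumerate(b)]


def derivative(a):
    """Coefficients of A'(t) from the coefficients of A(t)."""
    return [n * c for n, c in enumerate(a)][1:]


def shift_down(b):
    """Coefficients of (B(t) - B(0)) / t."""
    return b[1:]


# Illustrative (made-up) coefficients of B(t), truncated after t^4.
b = [3, 1, 4, 1, 5]
lhs = derivative(borel_entire(b))      # coefficients of A'(t)
rhs = borel_entire(shift_down(b))      # coefficients of BEF{(B(t) - B(0)) / t}
print(lhs == rhs)
```

Both sides reduce to $(n+1)a_{n+1}$ as the coefficient of $t^n$, which is the content of the lemma.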


Lemma 15. For all 0 Sj Si, 


dt' 


This can be established by a straightforward induction argument, using 
Lemma 14; the proof may therefore be omitted. 
Since D,(¢) satisfies the differential equation 


= LoDal) + + = 


where £;(¢) is a polynomial having no term ¢* with k<i, we may apply 
Lemma 15. It gives us 




THEOREM 19. The functions A,(t) satisfy the kth-order linear homogeneous 
differential equation 


k ok 


da‘ 
i=j j=0 dt’ 
Here the coefficients of the various derivatives of A,(#) are polynomials 
in ¢, that of A,“ (#) being 
+ + + dix) = *L,(1/2). 


COROLLARY 1. The only possible singularities (in the finite plane) of the
functions $A_n(t)$ are at the reciprocals of the zeros of $L_k(t)$.

$A_n(1/t)$ has then only the zeros of $L_k(t)$, and the origin, as possible singularities.
From this follows

COROLLARY 2. In relations (18) and (70), the contour C may be chosen as any
contour which has the origin and all the zeros of $L_k(u)$ in its interior.

By the argument used just before Lemma 14, applied to equation (70) 
with a contour C of Corollary 2, we have 

THEOREM 20. A $y_n(x)$-expansion is unique† if it converges uniformly in a
simply-connected open region R that contains the origin and all the zeros of
$L_k(x)$.

The numbers $\{\lambda_n\}$ are the characteristic numbers for our original equations
(1), (12). It is natural to inquire if they have like significance for equation
(73), regarded independently of its origin. The answer, in the affirmative,
is given by

THEOREM 21. The differential equation 


k 


di 
(73) p> = 


i=j j=0 


has a formal power series solution about the origin if and only if $\lambda=\lambda_0, \lambda_1, \cdots$,
and when $\lambda=\lambda_n$, there is a unique solution (to within an arbitrary constant
multiplier); this solution converges about t=0, and is, in fact, the function $A_n(t)$.


To prove this, assume the expansion $A(t)=\sum_{n=0}^{\infty} a_nt^n$. On substituting into
(73′) and equating coefficients, the values $\lambda_0, \lambda_1, \cdots$ are found to be the only
possible ones, and these yield unique solutions. The remainder of the theorem
follows from the fact that $A_n(t)$ is a solution for $\lambda=\lambda_n$.


† That is, if two such expansions (uniformly convergent in R) represent the same function,
corresponding coefficients are equal.




We now turn to the functions $\{Y_n(x)\}$. From equation (4) for $y_n(x)$ and
the property that $y_n(x)$ = BEF{$Y_n(x)$}, we derive the corresponding differential
equation for $Y_n(x)$. Unlike the case for $A_n(t)$, however, the new equation
is of infinite order. In fact, we have
(a) $Y_n(x) = Y_0 + Y_1x + \cdots + Y_nx^n$,
where $Y_i=i!\,y_i$. (We write $Y_i$, $y_i$ for $Y_{ni}$, $y_{ni}$.) From (a) follows


LEMMA 16. The coefficients $Y_{n-i}$ of $Y_n(x)$ are given by†

(b) $Y_{n-i} = [1/(n-i)!]\{Y_n^{(n-i)}(x) - (x/1!)Y_n^{(n-i+1)}(x) + (x^2/2!)Y_n^{(n-i+2)}(x) - \cdots\}$ $(i = 0, 1, \cdots, n)$.


To show this let $T_n(x)$ denote the right hand member of (b). Since $Y_n(x)$
is of degree n, $T_n(x)$ is unaltered if we add to it terms containing higher
derivatives of $Y_n(x)$ than the nth. That is, we can write


rx) =[——] 


the series being uniformly convergent in every bounded region. Letting C 
be a contour surrounding the origin, we have 


and on substituting into T,(x), we obtain 


1 Y,,(u) 


c (u— x) 


s! u— x 


s=0 


T,(x) = 


Now the brace is the expansion of 


valid for $|x/(u-x)|<1$. We can choose C to satisfy this condition and also to
surround the origin. Then,


$T_n(x) = (1/(2\pi i))\int_C Y_n(u)\,u^{-(n-i+1)}\,du = Y_{n-i}$.

That is, (b) holds.
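
The footnote to Lemma 16 notes that (b) holds for any polynomial of degree n. A minimal numerical sketch of that remark follows, assuming (b) is read as the alternating Taylor-type identity $c_m = \frac{1}{m!}\sum_{s\ge 0}\frac{(-x)^s}{s!}P^{(m+s)}(x)$ for the coefficient $c_m$ of $x^m$ (with m standing for n−i); the helper names and the sample polynomial are made up for illustration.

```python
from fractions import Fraction
from math import factorial


def poly_derivative(p, order=1):
    """Differentiate a polynomial given by ascending coefficients, `order` times."""
    for _ in range(order):
        p = [k * c for k, c in enumerate(p)][1:]
    return p


def poly_eval(p, x):
    """Evaluate the polynomial with ascending coefficients p at x."""
    return sum((c * x**k for k, c in enumerate(p)), Fraction(0))


def coefficient_from_derivatives(p, m, x):
    """Recover the coefficient of x^m from derivatives of p evaluated at x."""
    n = len(p) - 1                      # degree bound; higher derivatives vanish
    total = Fraction(0)
    for s in range(n - m + 1):
        term = poly_eval(poly_derivative(p, m + s), x)
        total += (-x) ** s / factorial(s) * term
    return total / factorial(m)


# Made-up polynomial 2 + 3x - x^2 + 5x^3 and an arbitrary evaluation point.
p = [Fraction(2), Fraction(3), Fraction(-1), Fraction(5)]
x0 = Fraction(7, 3)
recovered = [coefficient_from_derivatives(p, m, x0) for m in range(len(p))]
print(recovered == p)
```

The recovered values do not depend on the point `x0`, in line with the way $T_n(x)$ turns out to be a constant in the proof.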


† (b) is of course true for any polynomial of degree n.


Now let 0<j <i. From y,(x) we obtain 


(n —i+ 7)! (n—-i+j-—1)! 


= BEF + - 
(n — i)! (n —i—1)! 


+--+ 


and on using (b) of Lemma 16, this gives 
(c) xiy,(x) = BEF (x) + (x) - 
+ }, 
(@—i+j—1)! 


— al (p= i,i+1,---,m). 


(d) Op; 


We readily get 
Lemma 17. The quantity 0,,;; is the coefficient of u®-‘ in the power series 
expansion of e~“H;;(u), where H ;;(u) is the entire function 
= 
= 


If we substitute the value of x“y,‘?(x) as given by (c) into equation (4), 
for we obtain 


+ = AnV (x), 


which is a linear homogeneous differential equation of order m for Y,(x). 
(74) can be written in the form 


(74’) M(x)Y,(x) + (x) + + = AnV a(x), 
where 


r=0 


with s=k ifi=k, and s=iifi<k. 




Clearly, $M_i(x)$ is independent of n. Since $Y_n^{(s)}(x)=0$, s > n, we see that
the functions $\{Y_n(x)\}$ are solutions of the linear homogeneous differential
equation of infinite order


(76) $\sum_{s=0}^{\infty} M_s(x)Y^{(s)}(x) = \lambda Y(x)$,

where for $Y=Y_n(x)$ we have $\lambda=\lambda_n$.
It is seen that $M_i(x)$ is a polynomial of degree not exceeding i, so that
equation (76) belongs to the type considered in Sets (pp. 29-31).


THEOREM 22. The only polynomial solutions of (76) are the polynomials
$\{Y_n(x)\}$, and the only characteristic numbers are therefore $\{\lambda=\lambda_n\}$.


For let Y(x) be a polynomial satisfying (76) with the value $\lambda=\lambda'$. Then,
since the relation between equations (4) and (76) can be traced in both directions,
the polynomial y(x) given by y(x) = BEF{Y(x)} will satisfy (4)
for $\lambda=\lambda'$. This can be true only if $\lambda'$ is one of the numbers $\lambda_n$, and in this case
we must have $y(x)=y_n(x)$. Hence $Y(x)=Y_n(x)$.


PENNSYLVANIA STATE COLLEGE, 
STATE COLLEGE, Pa. 


ON THE RESULTANT OF A SYSTEM OF FORMS 
HOMOGENEOUS IN EACH OF SEVERAL 
SETS OF VARIABLES* 


BY 
NEAL H. McCOY 


INTRODUCTION 


One of the most fundamental problems in the theory of elimination may 
be stated as follows. Let 


(1) $f_i$ $(i = 1, 2, \cdots, n)$


be a set of n general forms homogeneous in the n variables $x_1, x_2, \cdots, x_n$;
to determine the polynomial in the coefficients of these forms whose vanishing
is a necessary and sufficient condition that the forms (1) simultaneously
vanish for a set of values, not all zero, of the variables $x_1, x_2, \cdots, x_n$. This
polynomial is called the resultant of the system of forms (1). From this stand- 
point a numerical factor in the resultant is of no consequence though in cer- 
tain cases it is desirable to introduce some convention as to such a factor. 

The most important properties of the resultant of the system (1) are well
known and have been obtained by various authors in a variety of ways.†
We give a brief account of the method used by König‡ as it is of particular
importance in the sequel.

Let us denote by 


(1’) fi (¢ = 1,2,-+-,m) 


the general non-homogeneous polynomials obtained from (1) by placing one 
variable, say x,, equal to unity in each form. We now consider the module 
defined by these polynomials, that is, the system of all polynomials of the 
form 


(2) dift + + 


* Presented to the Society, September 9, 1931; received by the editors May 21, 1932. This 
paper was practically completed while the author was a National Research Fellow at Princeton Uni- 
versity. 

† See the Encyklopädie, vol. 1, pp. 260-273; also J. König, Einleitung in die allgemeine Theorie
der algebraischen Grössen, Leipzig, Teubner, 1903, chapter VI; F. S. Macaulay, Algebraic Theory of
Modular Systems, Cambridge University Tracts, No. 19, 1916, pp. 3-17; O. Perron, Algebra I: Die
Grundlagen, Göschens Lehrbücherei, vol. 8, 1927.

‡ Op. cit. The exposition given by König is based on earlier work of F. Mertens. For references
see König, op. cit., p. 271.



216 N. H. McCOY [January 


where the $\phi_i$ are also polynomials in $x_1, x_2, \cdots, x_{n-1}$. It may be shown that
there exists one and only one polynomial R in the coefficients of the polynomials
(1′) satisfying the following two conditions: (i) R is a member of the
module (2), and (ii) R is an irreducible function of the coefficients of these
general polynomials. This polynomial R is defined by König to be the resultant
of the polynomials (1′) and also of the forms (1). The resultant as
thus defined is identical with the polynomial in the coefficients whose van- 
ishing is a necessary and sufficient condition that the forms (1) vanish for a 
common set of values of the variables. However, this fact is not the center 
of interest from this point of view. The usual properties of the resultant may 
be obtained by a method of induction. 

It is the purpose of the present paper to consider a certain generalization 
of the concept of resultant from the point of view of modular systems. Let 


(3) $F_i$ $\left(i = 1, 2, \cdots, 1+\sum_{j=1}^{r}\alpha_j\right)$


denote a set of general forms homogeneous in the variables of each of r (≥ 1)
sets, there being $\alpha_j+1$ variables in the jth set $(j=1, 2, \cdots, r)$ and each
$\alpha_j\ge 1$. Sylvester* seems to have been the first to consider the concept of a
resultant of forms of the type (3) and although he did not define the re- 
sultant of such a set of forms, he stated without proof a general theorem re- 
garding the degree and weight of the resultant. This theorem is essentially 
our Theorem 3 (c,d) below. 

A definition and a brief discussion of the resultant of the system (3) was 
given by Lasker† from a point of view somewhat similar to that of the present
paper. However, Lasker was not primarily interested in the structure of the 
resultant but in its use in generalizing certain theorems in the theory of 
modules and ideals. 

Certain special cases have been studied by different authors with, of 
course, varying points of view. Sylvester and Muir‡ have discussed the resultant
of a system of forms linear in each of two sets of variables and have
expressed the resultant in the form of a determinant in two or three different 


* J. J. Sylvester, On the degree and weight of the resultant of a multipartite system of equations, 
Proceedings of the Royal Society of London, vol. 12 (1862-63), pp. 674-76, or Mathematical Papers, 
vol. 2, pp. 329-330. 

† E. Lasker, Zur Theorie der Moduln und Ideale, Mathematische Annalen, vol. 60 (1905), pp.
105-107.

‡ T. Muir, The resultant of a set of homogeneous lineo-linear equations, Transactions of the Royal
Society of South Africa, vol. 2 (1910-12), pp. 373-380; J. J. Sylvester, On a question of compound arrangement,
Proceedings of the Royal Society of London, vol. 12 (1862-63), pp. 561-563, or Mathematical
Papers, vol. 2, pp. 325-326.


1933] THE RESULTANT OF A SYSTEM OF FORMS 217 


ways. The case of three double binary forms has been considered by Moore 
and by the present author with the object of expressing the resultant in 
determinantal form.* 

In Part I we give a definition of the resultant of a system of forms of
type (3) and deduce some of its fundamental properties. The outline of procedure
is essentially that of König† for the classical case r=1. Some of his
results can be carried over immediately to this more general case and with
one or two exceptions we shall refer to König for the proofs wherever possible.
However we give in some detail the demonstrations that involve any essential
modification or extension.

Part II consists of a generalization of Sylvester’s dialytic method of 
elimination to certain cases of forms of the type here considered. The main 
result is Theorem 4. As special cases of this theorem we obtain the resultant 
in the form of a determinant for (i) two ordinary binary forms of arbitrary 
degrees (Sylvester’s determinant); (ii) multiple binary forms of arbitrary 
degrees in the variables of one set, all the forms being of the same degree in 
the variables of any other given set; and (iii) forms linear in any number of 
sets of variables, there being an arbitrary number of variables in each set. 
The form of the determinant in the third case for two sets of variables is 
different from the determinants obtained by Muir to which reference was 
made above. 
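
Case (i), Sylvester's determinant for two binary forms, can be illustrated numerically. The sketch below is the standard textbook construction written in dehomogenized form (coefficient lists in descending powers), not the determinant of Theorem 4 itself; the function names and sample polynomials are chosen only for illustration.

```python
from fractions import Fraction


def sylvester_matrix(f, g):
    """Sylvester matrix of f (degree m) and g (degree n), descending coefficients."""
    m, n = len(f) - 1, len(g) - 1
    size = m + n
    rows = []
    for shift in range(n):              # n shifted copies of the coefficients of f
        rows.append([0] * shift + list(f) + [0] * (size - m - 1 - shift))
    for shift in range(m):              # m shifted copies of the coefficients of g
        rows.append([0] * shift + list(g) + [0] * (size - n - 1 - shift))
    return rows


def determinant(matrix):
    """Exact determinant by Gaussian elimination over the rationals."""
    a = [[Fraction(v) for v in row] for row in matrix]
    size = len(a)
    result = Fraction(1)
    for col in range(size):
        pivot = next((r for r in range(col, size) if a[r][col] != 0), None)
        if pivot is None:
            return Fraction(0)
        if pivot != col:
            a[col], a[pivot] = a[pivot], a[col]
            result = -result            # a row swap changes the sign
        result *= a[col][col]
        for r in range(col + 1, size):
            factor = a[r][col] / a[col][col]
            for c in range(col, size):
                a[r][c] -= factor * a[col][c]
    return result


def resultant(f, g):
    return determinant(sylvester_matrix(f, g))


common_root = resultant([1, -3, 2], [1, -1])   # (x-1)(x-2) and x-1 share the root 1
no_common_root = resultant([1, 0, 1], [1, -1]) # x^2+1 and x-1 share no root
print(common_root, no_common_root)
```

The determinant vanishes exactly when the two polynomials have a common root (leading coefficients being non-zero), which is the defining property of the resultant recalled in the Introduction.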


In general, we obtain more than one determinantal expression for the 
resultant as the form of the determinant occurring in the statement of The- 
orem 4 depends in a certain way upon the notation adopted. 


I. DEFINITION AND FUNDAMENTAL PROPERTIES OF THE RESULTANT 


1. Notation and preliminary remarks. Let us denote by $x_{j1}, x_{j2}, \cdots, x_{j,\alpha_j+1}$
the variables of the jth set occurring in the forms (3) $(j=1, 2, \cdots, r)$.
We shall henceforth let m denote the quantity $1+\sum_{j=1}^{r}\alpha_j$. The degree of
$F_i$ in the variables of the jth set will be indicated by $n_{ij}$ $(i=1, 2, \cdots, m;$
$j=1, 2, \cdots, r)$. We assume throughout that each $n_{ij}>0$; that is, each of
the sets of variables actually appears in each form.‡


* T. W. Moore, Extended results in elimination, Annals of Mathematics, vol. 30 (1928), pp. 92- 
100; N. H. McCoy, On the resultant of three double binary forms, Ibid., vol. 33 (1932), pp. 177-183. 
We include here the following additional references which have some relation to the subject of this 
paper: A. Brill, Ueber Elimination aus einem gewissen System von Gleichungen, Mathematische An- 
nalen, vol. 5 (1872), pp. 378-396; T. Muir, Elimination in the case of equality of fractions whose numer- 
ators and denominators are linear functions of the variables, Transactions of the Royal Society of 
Edinburgh, vol. 45 (1906), pp. 1-7; K. Th. Vahlen, Ueber den Grad der Eliminationsresultante eines 
Gleichungssystems, Journal für die Reine und Angewandte Mathematik, vol. 113 (1894), pp. 348-352.

† Op. cit. It will be understood henceforth that any reference to this author refers to this book.

‡ A resultant exists under certain conditions even if this restriction is not made. Cf. Lasker,
loc. cit., p. 106.




It will be convenient at present to consider in place of the homogeneous 
forms (3) the general non-homogeneous polynomials 


(4) F; (i= 1,2, ---,m) 


obtained from them by placing $x_{j,\alpha_j+1}=1$ $(j=1, 2, \cdots, r)$ in each form. By
a general polynomial we shall mean henceforth a polynomial obtainable in
this way from a general form of the type (3).

The totality of variables in all the various sets may be denoted by x, and
a will indicate the aggregate of coefficients in all the forms under discussion.
Thus $\phi(a, x)$ will represent a polynomial in the coefficients of the set (4) and
in certain of the variables.

Let $\gamma_i$ denote the constant term in $F_i$, that is, the term containing none
of the variables. By $[\phi(a, x)]$ we shall indicate the polynomial obtained from
$\phi(a, x)$ by substituting $\gamma_i-F_i$ for $\gamma_i$ $(i=1, 2, \cdots, m)$.* If we make this
substitution only on $\gamma_i$ $(i=2, 3, \cdots, m)$, we shall indicate the resulting
polynomial by $[\phi(a, x)]_1$. It is seen that


$\phi(a, x) = [\phi(a, x)]_1 + H_2F_2 + \cdots + H_mF_m$,


where the H's are polynomials. This may be expressed in the usual notation,
(5) $\phi(a, x) \equiv [\phi(a, x)]_1 \pmod{F_2, F_3, \cdots, F_m}$.


For the sake of completeness we now prove two theorems which are of 


fundamental importance. The proofs do not differ in any essential from the 
corresponding proofs in the special case r=1 but the second in particular il- 
lustrates a method of proof which is important in establishing later theorems. 


THEOREM 1.† If
$\phi(a, x) \equiv 0 \pmod{F_1, F_2, \cdots, F_k}$,


then $\phi$ actually contains all the coefficients occurring in $F_1$ or
$\phi(a, x) \equiv 0 \pmod{F_2, F_3, \cdots, F_k}$.
Suppose a is a coefficient in $F_1$ not occurring in $\phi$; then it does not appear
in $[\phi(a, x)]_1$. Since
$\phi(a, x) = H_1F_1 + \cdots + H_kF_k$
we have
$[\phi(a, x)]_1 = [H_1]_1F_1$.

* This is the Kronecker substitution. 

† König, p. 262.

‡ This relation is of course understood to be an identity in the variables x and the coefficients a.
In particular, $\psi(a) = 0$ shall indicate that $\psi$ vanishes identically in the coefficients a.




But $[H_1]_1=0$, as otherwise $[\phi(a, x)]_1$ and consequently $\phi(a, x)$ would contain
a. Hence $[\phi(a, x)]_1=0$ and from relation (5) (with m replaced by k) we have
the desired result.


THEOREM 2.* If
$\psi(a) \equiv 0 \pmod{F_1, F_2, \cdots, F_k}$


where k ≤ m−1, then $\psi(a)=0$.


It is clearly sufficient to prove the theorem for $k=m-1=\sum_{j=1}^{r}\alpha_j$, which
is the total number of variables occurring in the polynomials $F_i$. The theorem
is seen to be true in case m=2 as in this case we have a single polynomial in
a single variable. We accordingly prove the theorem by induction on the
total number of variables in our polynomials. We assume the theorem is true
for μ−1 general polynomials in μ−1 variables, no matter how the variables
are distributed among the various sets.

Let $G_1, G_2, \cdots, G_\mu$ be general polynomials in a total of μ variables, and
let $\psi$ denote any polynomial in the coefficients of these polynomials satisfying
the relation


$\psi \equiv 0 \pmod{G_1, G_2, \cdots, G_\mu}$.


By Theorem 1, $\psi$ actually contains the coefficient, say β, of the term $x_{11}^b$ (b > 0)
in $G_1$ or

$\psi \equiv 0 \pmod{G_2, G_3, \cdots, G_\mu}$.
In the latter case we have the identity,


K.G.+---+ K,G, 
= (Ke) + + (Ky) ,=0- 


But $(G_2)_{x_{11}=0}, \cdots, (G_\mu)_{x_{11}=0}$ is a set of μ−1 general polynomials in μ−1
variables, and by the hypothesis of the induction, $\psi=0$. Suppose however
that $\psi$ contains β, and when arranged according to powers of β let $\psi_s$ be the
coefficient of $\beta^s$ (s > 0), the highest power of β occurring in $\psi$. In the identity


i=1 


equate the coefficients of 6* on both sides. We get 


Lan + LG: +--+ + 
where Z,=0 if H; is of degree less than s—1 in 8. Place x,,=0 and we have 


* Cf. König, p. 263.




Vs 21 + (L,) 2; ,~0(G,) 2; 


By the argument above we find that Y, =0, which contradicts our assumption 
that 8 actually appeared in y. Hence y =0. 

2. The fundamental theorem. We have just shown that there exists no
polynomial in the coefficients of the polynomials (4) which belongs to the
module defined by $F_1, F_2, \dots, F_k$ where $k \le m - 1$. That there does exist
such a polynomial if $k = m$ is shown by Theorem 3 below. Before stating this
theorem we need to give a definition.

By the weight of a coefficient of $F_i$ with regard to the variables of the $k$th set,
we shall mean the exponent of $x_{k,\alpha_k+1}$ in the corresponding term of the
homogeneous form.

For convenience let us set 


$$L_i = n_{i1} t_1 + n_{i2} t_2 + \cdots + n_{ir} t_r \qquad (i = 1, 2, \dots, m),$$


where the $t$'s are a set of independent parameters and $n_{ij}$ represents the degree
of $F_i$ in the variables $x_{j1}, x_{j2}, \dots, x_{j,\alpha_j}$ of the $j$th set. We may now state the
following fundamental theorem.


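THEOREM 3.* There exists one and only one† rational and integral function,
say $R(a)$, of the coefficients of the general polynomials (4) with the following
properties:

(a) $R(a)$ is irreducible;

(b) $R(a) \equiv 0 \pmod{F_1, F_2, \dots, F_m}$;

(c) $R(a)$ is homogeneous and of degree $N_i$ in the coefficients of $F_i$ separately,
where

$$N_i = \text{coefficient of } t_1^{\alpha_1} t_2^{\alpha_2} \cdots t_r^{\alpha_r} \text{ in } \prod_{l=1}^{m}{}^{\prime}\, L_l \ ‡ \qquad (i = 1, 2, \dots, m);$$

(d) $R(a)$ is isobaric of weight $W_k$ with regard to the variables of the $k$th set, where

$$W_k = \text{coefficient of } t_1^{\alpha_1} \cdots t_{k-1}^{\alpha_{k-1}} t_k^{\alpha_k+1} t_{k+1}^{\alpha_{k+1}} \cdots t_r^{\alpha_r} \text{ in } \prod_{l=1}^{m} L_l \qquad (k = 1, 2, \dots, r).$$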


This polynomial R(a) is defined to be the resultant of the polynomials 
(4), and is thus defined only to within a numerical factor. Lemma 1 below 


* Cf. König, p. 271. Parts (c) and (d) of this theorem were stated by Sylvester, Proceedings of
the Royal Society of London, vol. 12 (1862-63), pp. 674-76, or Mathematical Papers, vol. 2, pp.
329-330.

† That is, if $R(a)$ and $R'(a)$ are two polynomials satisfying these conditions, then they differ
by only a numerical factor.

‡ This notation indicates, as usual, that $l$ is not to take the value $i$ in this product.




shows that the resultant is determined by the properties (a) and (b) and ac- 
cordingly the remaining properties must be consequences of these. 
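
The coefficient extractions in parts (c) and (d) of Theorem 3 can be carried out mechanically. The following is a minimal Python sketch, assuming the reading $L_i = n_{i1}t_1 + \cdots + n_{ir}t_r$, with $N_i$ the coefficient of $t_1^{\alpha_1}\cdots t_r^{\alpha_r}$ in the product of the $L_l$ with $l \ne i$, and $W_k$ the coefficient of the same monomial with $\alpha_k$ raised by one in the product of all the $L_l$. The function names are illustrative only and do not come from the paper.

```python
def poly_mul(p, q):
    """Multiply two polynomials stored as {exponent tuple: coefficient}."""
    out = {}
    for ea, ca in p.items():
        for eb, cb in q.items():
            e = tuple(x + y for x, y in zip(ea, eb))
            out[e] = out.get(e, 0) + ca * cb
    return out


def linear_form(row):
    """The form n_1 t_1 + ... + n_r t_r for one row of degrees."""
    r = len(row)
    return {tuple(int(k == j) for k in range(r)): n
            for j, n in enumerate(row) if n}


def product_of_forms(rows, r):
    """Product of the linear forms attached to the given rows."""
    prod = {(0,) * r: 1}
    for row in rows:
        prod = poly_mul(prod, linear_form(row))
    return prod


def resultant_degrees(n, alpha):
    """n[i][j]: degree of the i-th polynomial in the j-th set.
    alpha[j]: number of non-homogeneous variables in the j-th set.
    Returns the list [N_1, ..., N_m]."""
    m, r = len(n), len(alpha)
    if m != 1 + sum(alpha):
        raise ValueError("need m = 1 + sum(alpha) polynomials")
    return [product_of_forms(n[:i] + n[i + 1:], r).get(tuple(alpha), 0)
            for i in range(m)]


def resultant_weights(n, alpha):
    """Returns the list [W_1, ..., W_r]."""
    r = len(alpha)
    full = product_of_forms(n, r)
    weights = []
    for k in range(r):
        e = list(alpha)
        e[k] += 1  # raise the exponent of t_k by one
        weights.append(full.get(tuple(e), 0))
    return weights
```

For one set of variables and degrees 2, 3, 4 the sketch reproduces the classical values (each $N_i$ is the product of the other two degrees, and the weight is the product of all three). For three forms of degree one in each of two sets containing one variable apiece, it gives degree 2 in the coefficients of each form.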

The theorem is known to be true in the ordinary case of one set of variables.
We have in this case $r = 1$, $m = 1 + \alpha_1$, $N_i = (n_{11} n_{21} \cdots n_{m1})/n_{i1}$,
$W_1 = n_{11} n_{21} \cdots n_{m1}$. However we shall in the proof of the theorem only make
use of this fact for the case of two ordinary polynomials in a single variable,
which is the case for $m = 2$. We now assume the theorem for $m$ general
polynomials (4) where $r$ and $\alpha_j$ $(j = 1, 2, \dots, r)$ are any positive integers such that
$m = 1 + \sum_{j=1}^{r} \alpha_j$. We shall show that it holds for $m + 1$ general polynomials in
a total of $m$ variables.
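
For the base case $m = 2$ the homogeneity and isobaric properties can be checked directly on the Sylvester determinant. The sketch below is an illustration under stated assumptions, not part of the paper's argument: it writes the two forms as $a_0x^{n_1} + a_1x^{n_1-1}y + \cdots$ and $b_0x^{n_2} + \cdots$, assigns weight $i$ to $a_i$ and $b_i$ (the exponent of $y$), and records the degrees and weight of every non-vanishing product in the Leibniz expansion. The function names are hypothetical.

```python
from itertools import permutations


def sylvester_symbolic(n1, n2):
    """Sylvester matrix with symbolic entries.
    ('a', i) stands for the coefficient of x^(n1-i) y^i in the first form,
    ('b', i) likewise for the second form; None marks a zero entry."""
    size = n1 + n2
    rows = []
    for shift in range(n2):
        row = [None] * size
        for i in range(n1 + 1):
            row[shift + i] = ("a", i)
        rows.append(row)
    for shift in range(n1):
        row = [None] * size
        for i in range(n2 + 1):
            row[shift + i] = ("b", i)
        rows.append(row)
    return rows


def term_signatures(n1, n2):
    """Set of (degree in a, degree in b, weight) over all non-zero
    products appearing in the expansion of the determinant."""
    mat = sylvester_symbolic(n1, n2)
    size = n1 + n2
    signatures = set()
    for perm in permutations(range(size)):
        entries = [mat[row][perm[row]] for row in range(size)]
        if any(e is None for e in entries):
            continue
        deg_a = sum(1 for name, _ in entries if name == "a")
        weight = sum(i for _, i in entries)
        signatures.add((deg_a, size - deg_a, weight))
    return signatures
```

Every product has the same signature $(n_2, n_1, n_1 n_2)$, in agreement with $N_1 = n_{21}$, $N_2 = n_{11}$ and $W_1 = n_{11} n_{21}$ for two polynomials in one variable.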

Let

(6)    $G_i$    $(i = 1, 2, \dots, m+1)$

be a set of general non-homogeneous polynomials in $s$ sets of variables with
$\beta_j$ variables in the $j$th set and $m = \sum_{j=1}^{s} \beta_j$. We may without confusion denote
the variables of the $j$th set by $x_{j1}, x_{j2}, \dots, x_{j,\beta_j}$. Let $\nu_{ij}$ $(> 0)$ be the degree
of $G_i$ in the variables of the $j$th set $(i = 1, 2, \dots, m+1;\ j = 1, 2, \dots, s)$.
Further let 


= vat + + (i =1,2,---,m+1), 


m+1 


; = coefficient of in [] 1,2,---,m+1), 


— Bs m+1 


We wish to show under the hypothesis of the induction that the resultant of
the polynomials (6) exists and is of degree $N_i$ in the coefficients of $G_i$ and of
weight $W_k$ with regard to the variables of the $k$th set.

Before proceeding further we need three lemmas, the first two of which 
we shall state without proof as they may be readily established as in the case 
of one set of variables. 


Lemma 1. (König, pp. 267, 272-74.) If there exists a polynomial $\phi$ in the
coefficients of the general polynomials (6) such that

$$\phi \equiv 0 \pmod{G_1, G_2, \dots, G_{m+1}},$$

then there exists one and only one irreducible polynomial $R'$ with the same
property and $\phi$ is divisible by $R'$. Also $R'$ is homogeneous in the coefficients of
each polynomial separately and isobaric with regard to each set of variables.


The proof of this lemma does not depend upon the hypothesis of the in- 
duction. 




Lemma 2. (König, pp. 274-75.) Let $g(a, x)$ be any polynomial in the coefficients
of $F_1, F_2, \dots, F_m$ and in the variables, of total degree $d$ in the variables.


Then 
Ov: 
where h(a) is a polynomial in the coefficients only and y; is the constant term in 
F;. Further, a necessary and sufficient condition that g(a, x)=0 (mod Fi, Fo, 

, Fm) is that h(a)=0 (mod F,, Fo, , Fm). 

Lemma 3.* Let G; (i=1, 2, - - - , m+1) be the general polynomials (6) and 
indicate by a the coefficient of the term in If ¥(40) is a 
polynomial in the coefficients of these polynomials and 

= 0 : (mod Gi, Go, Gm+t); 


) et, x) = h(a) (mod F,, Fa, Fn), 


then in the development of W according to powers of a, 
YW 
W, is divisible by Ri"R;’ - - - Re", where R, is the resultant of the general poly- 
nomials, » (R=1, 2,---, 5). 
Each $R_k$ has the properties of Theorem 3 by the hypothesis of the induction,
as it is the resultant of a set of $m$ general polynomials in a total of $m - 1$
variables.


We have 
= MG, + + ++ + 


and by equating coefficients of a on both sides7 we get 


Place x,.,=0 and we have 
(Ke) 2,,=0(G2) + + (K m+1) 24;=0(Gm41) 24 :=0; 
and by Lemma 1, y, is divisible by Ri (k=1, 2,---, 5). 


Suppose y, is divisible by Ri*~ but not by Ry*~'*’. Then we may write 


l 
(8) 1, 
where 7 does not contain R, as a factor. From (7) we get 
Vil "12 


(9) [nh = Kxy x1 +++ Xa. 


* This lemma is a generalization of a theorem of Mertens. See König, p. 282. The lemma is
stated in unsymmetrical form for convenience of notation.
t The coefficient a actually occurs in y by Theorems 1 and 2, hence ¥,#0. 




It may now be shown that [R,], is divisible by x: but not by xm?.* Hence 
[R]*~ is exactly divisible by xZ*~' and thus by (9), [7]: is divisible by 2;,. 
Hence by relation (5) and Lemma 1, 7 is divisible by R; or /=0. But 7 is 
not divisible by R;, thus /=0 and y, is divisible by R;*. As each of the re- 
sultants Ri, Re, - - - , R, contains coefficients not in any of the others and 
each is irreducible, it follows that y, is divisible by the required factor. 

Let us assume for the moment the existence of the resultant of the general
polynomials (6). We can then use the resultant for the $\psi$ of this lemma. Thus
the degree of the resultant in the coefficients of $G_i$ $(i = 2, 3, \dots, m+1)$ can
not be less than the degree to which these coefficients enter the product
$R_1^{\nu_{11}} R_2^{\nu_{12}} \cdots R_s^{\nu_{1s}}$. That is, the degree of the resultant in the coefficients of $G_i$
can not be less than

8 m+1 
j=1 l=2 


m+1 
Bs 


= coefficient of in = -++,m+1). 
lel 


The polynomial $G_1$ played an exceptional part in the statement of Lemma
3 but it is clear that any other one could be used in place of $G_1$. By an argument
similar to the above we find that the resultant is of degree not less than
$N_i$ in the coefficients of $G_i$ $(i = 1, 2, \dots, m+1)$. We proceed to show the
existence of the resultant and to show that its degree in the coefficients of
$G_i$ is not greater than $N_i$.

3. Existence of the resultant. It will be convenient to consider two cases
according as $\beta_j > 1$ for some $j$ or all $\beta_j = 1$. In the first case we may assume
that $\beta_1 > 1$.

Case 1. $\beta_1 > 1$. Let us write the polynomials (6) in the form


where the degrees of the variables of the first set are explicitly indicated in 
each term. For brevity let A indicate the coefficients in this set of poly- 
nomials. These polynomials may also be written in the form 


Ay A 


where $B_{\lambda}^{(i)}$ is a polynomial in $x_{1\beta_1}$ of degree $\nu_{i1} - \sum_{k=1}^{\beta_1 - 1} \lambda_k$. When written in
this form we think of the polynomials as polynomials in $s$ sets of variables,
there being $\beta_1 - 1$ in the first set and $\beta_j$ in the $j$th set $(j = 2, 3, \dots, s)$. Let
$B$ denote the aggregate of coefficients $B_{\lambda}^{(i)}$ in this set.


* See König, p. 284.




The resultant of $G_1, G_2, \dots, G_m$ when written in the form (11) exists by
the hypothesis of the induction. Let us denote it by $R_{m+1}(B)$ or $R_{m+1}(A, x_{1\beta_1})$
whenever we wish to show that it depends on the coefficients $A$ and also on
$x_{1\beta_1}$. It is seen that $R_{m+1}(A, 0)$ is the resultant of the general polynomials

$$(G_i)_{x_{1\beta_1}=0} \qquad (i = 1, 2, \dots, m).$$

Let us calculate the degree of $R_{m+1}(A, x_{1\beta_1})$ in $x_{1\beta_1}$. The coefficient $B_{\lambda}^{(i)}$ is
of degree $\nu_{i1} - \sum_{k=1}^{\beta_1 - 1} \lambda_k$ in $x_{1\beta_1}$, which is exactly the weight of $B_{\lambda}^{(i)}$ with regard
to the $\beta_1 - 1$ variables of the first set. Since this is true for each coefficient and
$R_{m+1}(B)$ is isobaric, it is seen by applying Theorem 3 that $x_{1\beta_1}$ occurs in each
term of $R_{m+1}(A, x_{1\beta_1})$ to the degree $N_{m+1}$. The coefficient of this highest power
of $x_{1\beta_1}$ in $R_{m+1}(A, x_{1\beta_1})$ is not zero, as it is the resultant of the general
polynomials obtained from $G_1, G_2, \dots, G_m$ by replacing each $B_{\lambda}^{(i)}$ by the
coefficient of the highest power of $x_{1\beta_1}$ occurring in $B_{\lambda}^{(i)}$.
We know that 


(12) =0 (mod Gi, Ge, Gm). 
Let b; be the constant term in G; when written in the form (11), that is, }; 
is a polynomial in x:s, but contains no other variables. Denote >>; _,¥m41,% 
by p. Then by Lemma 2, we have 
0b; 
We may also write 4(B) as h(A, x13,). Now h(B) is not zero and is of the first 


degree in the coefficients of Gn4i, aS Rm4i(B) does not contain these coeffi- 
cients. Suppose /(A, x13,) does not actually contain x1s,. Then we have 


h(A, 0) =0 (mod Gi, Go, Gm), 


(13) 


) == h(B) (mod Gi, Ge, Gm). 


and by Lemma 1, the resultant of our forms (10) exists and is of at most the
first degree in the coefficients of $G_{m+1}$. But we have shown above that the
resultant can not be of degree less than $N_{m+1}$ in these coefficients and $N_{m+1} \ge 1$.
Hence $N_{m+1}$ must in this case be equal to 1 and the resultant is of degree
$N_{m+1}$ in the coefficients of $G_{m+1}$.

Suppose then that $h(A, x_{1\beta_1})$ actually contains $x_{1\beta_1}$. It may now be shown
that $R_{m+1}(A, x_{1\beta_1})$ and $h(A, x_{1\beta_1})$ have no common factor other than a
numerical constant.* Let $S_{m+1}(A)$ be the ordinary resultant of these two as
polynomials in $x_{1\beta_1}$. Then $S_{m+1}(A) \ne 0$ and

$$S_{m+1}(A) \equiv 0 \pmod{R_{m+1}(B),\ h(B)},$$

or by (12) and (13),


* See König, p. 281.




$$S_{m+1}(A) \equiv 0 \pmod{G_1, G_2, \dots, G_{m+1}}.$$


Lemma 1 again establishes the existence of the resultant of the polynomials
(10) and we know that $S_{m+1}(A)$ is divisible by this resultant. Since
$R_{m+1}(B)$ is of degree $N_{m+1}$ in $x_{1\beta_1}$ and $h(A, x_{1\beta_1})$ is linear in the coefficients of
$G_{m+1}$, $S_{m+1}(A)$ is of degree $N_{m+1}$ in the coefficients of $G_{m+1}$. Thus the degree
of the resultant in the coefficients of $G_{m+1}$ is not greater than $N_{m+1}$, and by
the result of §2 this degree is not less than $N_{m+1}$. Hence this degree is exactly
$N_{m+1}$ as we wished to show.

By a similar argument it may be shown that the degree of the resultant
in the coefficients of $G_i$ is exactly $N_i$ for each $i$.

Case 2. $\beta_j = 1$ $(j = 1, 2, \dots, s)$, $m = s$. There is only one variable in each
set; let us denote them by $y_1, y_2, \dots, y_s$ respectively. The polynomials may


be written in the two forms 
(i) Ay Ae » 


G; = Ve 

where is of degree vj in 
Let $M_i$ denote the coefficient of $t_2 t_3 \cdots t_s$ in

$$\prod_{j=1}^{m}{}^{\prime}\, (\nu_{j2} t_2 + \nu_{j3} t_3 + \cdots + \nu_{js} t_s) \qquad (j \ne i).$$

Now form the resultant of $G_1, G_2, \dots, G_m$ as polynomials in $y_2, y_3, \dots, y_s$,
say $R_{m+1}(B) = R_{m+1}(A, y_1)$. Then by the hypothesis of the induction,
$R_{m+1}(A, y_1)$ is of degree

$$M = \sum_{i=1}^{m} \nu_{i1} M_i$$

in $y_1$. But $M$ is the coefficient of $t_1 t_2 \cdots t_s$ in

$$\prod_{j=1}^{m} (\nu_{j1} t_1 + \nu_{j2} t_2 + \cdots + \nu_{js} t_s),$$

and thus $M = N_{m+1}$. From this point we may proceed exactly as in the
previous case and the details will be omitted.

Before considering the proof of part (d) of Theorem 3 it is desirable to 
pass back to homogeneous forms. 

4. The resultant of homogeneous forms. Let us make our general poly- 
nomials (6) homogeneous in each of the s sets of variables by introducing 
new variables, 2;=%;,8,4: (j=1, 2, ---, s). These homogeneous forms will 
be denoted by 


(14) Gi 1). 




Let R indicate now the resultant of the polynomials (6). Then we have 


m+1 
R = 


i=1 
This goes over in the homogeneous case to 


m+1 

i=1 
where p; is the weight of R with regard to the variables of the kth set. We 
show below that p.=W, but for the present its value is immaterial. 

In let us place %1;,=%2;,= - =%s,;,=1, where these are any vari- 

ables of the respective sets, and denote the resulting non-homogeneous poly- 
nomials by G;‘?). Then from (15) we have 


m+1 
20 coe, A: G;. 


i=1 
By the Kronecker substitution we see that [R] =0 and thus 


R=0 (mod G;’, G:’, , Gays). 
If we denote by R“ the resultant of the polynomials G;‘*”, then this relation 
shows that R is divisible by R‘ and by reversing the process we see that 
R‘® is divisible by R. Thus we get the same resultant, defined only to within 
a numerical factor, no matter which one of the variables of each set in G; 
we place equal to unity. The resultant R of G; (¢=1, 2, - - - , m+1) we ac- 
cordingly define to be the resultant of the homogeneous forms (14). 

Let G/ (i=1, 2, - - - , m+1) denote the set of forms (14) after we have 
made a general non-singular linear transformation on the variables of say 
the first set, and let R be the resultant of this transformed set of forms. It 
follows immediately* that 


(16) R = RU, 


where U is a form in the coefficients of the transformation only. As a matter 
of fact U must be a power of the determinant of the transformation.t As a 
special case we see that the resultant is unchanged if the variables are per- 
muted in any way within the set in which they occur. 
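
The behaviour described by (16) can be observed numerically in the simplest case of two binary forms. The sketch below is an illustration, not the paper's construction: it computes the resultant as a Sylvester determinant, applies the substitution $x \to px + qy$, $y \to rx + sy$ to both forms, and compares the two resultants. The exponent $n_1 n_2$ on the determinant of the transformation is the classical value for two binary forms and is an assumption brought in from outside the text, which only says that $U$ is some power of the determinant. All function names are hypothetical.

```python
from itertools import permutations


def det(matrix):
    """Leibniz expansion; adequate for the small matrices used here."""
    n = len(matrix)
    total = 0
    for perm in permutations(range(n)):
        inversions = sum(perm[i] > perm[j]
                         for i in range(n) for j in range(i + 1, n))
        term = -1 if inversions % 2 else 1
        for row in range(n):
            term *= matrix[row][perm[row]]
        total += term
    return total


def sylvester_resultant(f, g):
    """f[i] multiplies x^(n1-i) y^i, and likewise for g."""
    n1, n2 = len(f) - 1, len(g) - 1
    rows = [[0] * k + list(f) + [0] * (n2 - 1 - k) for k in range(n2)]
    rows += [[0] * k + list(g) + [0] * (n1 - 1 - k) for k in range(n1)]
    return det(rows)


def form_mul(u, v):
    """Product of two binary forms in the same coefficient convention."""
    out = [0] * (len(u) + len(v) - 1)
    for i, a in enumerate(u):
        for j, b in enumerate(v):
            out[i + j] += a * b
    return out


def transform(f, p, q, r, s):
    """Substitute x -> p*x + q*y, y -> r*x + s*y in the binary form f."""
    n = len(f) - 1
    out = [0] * (n + 1)
    for i, c in enumerate(f):
        term = [c]
        for _ in range(n - i):
            term = form_mul(term, [p, q])
        for _ in range(i):
            term = form_mul(term, [r, s])
        out = [u + v for u, v in zip(out, term)]
    return out
```

With $f = x^2 - 3xy + 2y^2$, $g = x - 3y$ and the substitution with matrix rows $(2, 1)$, $(1, 3)$, whose determinant is 5, the resultant changes from 2 to $2 \cdot 5^{2} = 50$.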

We now prove the following: 


* König, p. 293.
† See Bôcher, Higher Algebra, p. 220. Thus the resultant is an invariant under independent
linear transformations of the various sets.




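Lemma 4. Let $x_{1j_1}, x_{2j_2}, \dots, x_{sj_s}$ be any variables of the respective sets in
the forms (14). Then the resultant of the forms (14) contains the coefficient of

$$x_{1j_1}^{\nu_{k1}} x_{2j_2}^{\nu_{k2}} \cdots x_{sj_s}^{\nu_{ks}}$$

in the $k$th form to the degree $N_k$.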


In view of the previous remarks it will be sufficient to prove this lemma
for the case where $j_r = \beta_r + 1$ $(r = 1, 2, \dots, s)$. For convenience of notation
let us consider the case $k = 2$. Let $a_2$ denote the coefficient under consideration,
that is, $a_2$ is the constant term in $G_2$. By Lemma 3, $a_2$ enters the resultant
to a degree as great as the degree to which it enters $R_1^{\nu_{11}} R_2^{\nu_{12}} \cdots R_s^{\nu_{1s}}$,
which by the use of induction on the number of variables is

8 m+1 

coeficient at... Tah rat) | 
k=1 
But this is $N_2$. The proof is unchanged for any $k \ne 1$. If $k = 1$, we need only
to change the way in which Lemma 3 has been stated for simplicity of notation.

It now follows from Lemma 3 by a consideration of the degrees that the 
resultant of the forms (14) contains the term 


"1s 


(17) a Ry Ro 9 R, 


"ls 


with at most a numerical coefficient. Here $a$ is the coefficient of $x_{11}^{\nu_{11}} x_{21}^{\nu_{12}} \cdots x_{s1}^{\nu_{1s}}$
in $G_1$. We may determine the weight of this expression (17) with regard to the
$k$th set of variables by the hypothesis of the induction and it is found to be
$W_k$. Since by Lemma 1 the resultant is isobaric, each term of the resultant is
of weight $W_k$ with regard to the $k$th set of variables. This proves part (d) of
Theorem 3 and thus completes the proof of the theorem.


II. THE RESULTANT IN DETERMINANT FORM 


5. We now pass to the problem of expressing the resultant in determinant 
form in certain special cases. 
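Let

(18)    $\phi_i$    $(i = 1, 2, \dots, m)$

be a set of general forms homogeneous in each of $s + t = r$ $(r \ge 1;\ s, t \ge 0)$ sets
of variables, there being $\alpha_j + 1$ variables in the $j$th set $(\alpha_j \ge 1,\ j = 1, 2, \dots, r)$.
We assume further that $\alpha_1 = \alpha_2 = \cdots = \alpha_s = 1$, and hence $m = s + 1 + \sum_{j=s+1}^{r} \alpha_j$.
Also the degrees of these forms in the various sets are assumed to be those
given in the following table:

(19)
$$\begin{array}{c|ccccccccc}
 & (1) & (2) & (3) & \cdots & (s) & (s+1) & (s+2) & \cdots & (r) \\ \hline
\phi_1 & n_{11} & n_2 & n_3 & \cdots & n_s & 1 & 1 & \cdots & 1 \\
\phi_2 & n_{21} & n_2 & n_3 & \cdots & n_s & 1 & 1 & \cdots & 1 \\
\vdots & \vdots & \vdots & \vdots & & \vdots & \vdots & \vdots & & \vdots \\
\phi_m & n_{m1} & n_2 & n_3 & \cdots & n_s & 1 & 1 & \cdots & 1
\end{array}$$

Here the degree of $\phi_i$ in the variables of the $j$th set is found at the intersection
of the $i$th row and $j$th column. The numbers $n_{j1}$ $(j = 1, 2, \dots, m)$, $n_2, n_3,
\dots, n_s$ are arbitrary positive integers.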

We shall express in determinant form the resultant of this system of 
forms. As a special case if 4=0, we have a set of multiple binary forms, the 
degree in the variables of the kth set being the same for each form if k>1. 
If, further, r= 1, we have two ordinary binary forms of arbitrary degrees and 
our form of the resultant reduces to the Sylvester determinant. If on the 
other hand s=0, we have a set of forms linear in each of ¢ sets of variables, 
there being an arbitrary number of variables in each set. 

We now state the principal result of this section. 


THEOREM 4. Let $\phi_i$ $(i = 1, 2, \dots, m)$ be the general forms (18) and consider
all possible equations of the type


(20)    $p_i \phi_i = 0.$


Here $p_i$ represents a power product of the variables of such a degree that $p_i \phi_i$ is
homogeneous of degree $\sum_{i=1}^{m} n_{i1} - 1$ in the variables of the first set; of degree
$(m - k + 1) n_k - 1$ in the variables of the $k$th set $(k = 2, 3, \dots, s)$; of degree
$\alpha_{s+l+1} + \alpha_{s+l+2} + \cdots + \alpha_r + 1$ in the variables of the $(s+l)$th set $(l = 1, 2, \dots, t-1)$;
and of the first degree in the variables of the $r$th set.* Considering these power
products of the variables as unknowns, we have in the set (20) the same number
of equations as unknowns and the determinant of the coefficients of the
unknowns is the resultant of the given forms.
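
The construction in Theorem 4 can be tried out in the special case $r = 1$ of two binary forms, where each product $p_i \phi_i$ is to be of degree $n_{11} + n_{21} - 1$. The following Python sketch is only an illustration of that special case under this reading of the theorem: it builds one row for every power product $p = x^a y^b$ of the required degree, takes the power products of degree $n_{11} + n_{21} - 1$ as the unknowns, and evaluates the determinant. The function names are hypothetical, and the overall sign of the determinant depends on the chosen ordering of rows and columns.

```python
from itertools import permutations


def det(matrix):
    """Leibniz expansion; adequate for the small matrices used here."""
    n = len(matrix)
    total = 0
    for perm in permutations(range(n)):
        inversions = sum(perm[i] > perm[j]
                         for i in range(n) for j in range(i + 1, n))
        term = -1 if inversions % 2 else 1
        for row in range(n):
            term *= matrix[row][perm[row]]
        total += term
    return total


def multiplier_matrix(f, g):
    """Coefficient matrix of the equations p*f = 0 and p*g = 0.
    f[i] multiplies x^(n1-i) y^i; column j belongs to the unknown
    x^(target-j) y^j, where target = n1 + n2 - 1."""
    n1, n2 = len(f) - 1, len(g) - 1
    target = n1 + n2 - 1
    rows = []
    for form, degree in ((f, n1), (g, n2)):
        # p runs over x^(target-degree-b) y^b for b = 0, ..., target-degree
        for b in range(target - degree + 1):
            row = [0] * (target + 1)
            for i, coefficient in enumerate(form):
                row[b + i] = coefficient
            rows.append(row)
    return rows
```

For $f = x^2 - 3xy + 2y^2$ the determinant is 2 when $g = x - 3y$ and vanishes when $g = x - y$, which shares the root $x = y$ with $f$. The matrix produced is the Sylvester matrix of the two forms.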


Let us calculate, for example, the number of the equations (20) arising
from $\phi_1$. Making use of the fact that the number of terms in a general polynomial,
homogeneous of degree $N$ in one set of $M$ variables, is

$$\frac{(N + M - 1)!}{N!\,(M - 1)!},$$

we see that the number of equations arising from $\phi_1$ is


* That is, $p_i$ does not contain the variables of the $r$th set if $t > 0$.






(x — 2)(m — n.( 


Qr-1 


Remembering that $m = s + 1 + \sum_{j=s+1}^{r} \alpha_j$, this may be reduced to the form


i=2 


t=2 


The total number of the equations (20) is found to be 


(22) ( ma) mans + my[(m — a,!), 
t=1 

and a calculation similar to the above shows that this is also the total number 

of unknowns. We thus have the same number of equations as unknowns. 
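
The counting can be checked by machine in small cases. The sketch below assumes the term count $(N + M - 1)!/(N!\,(M - 1)!)$ for a form of degree $N$ in $M$ variables, compares it with a direct enumeration, and then counts equations and unknowns for two binary forms of degrees $n_1$ and $n_2$, where every product $p_i \phi_i$ has degree $n_1 + n_2 - 1$. The function names are illustrative, and only this one-set special case is covered.

```python
from itertools import combinations_with_replacement
from math import comb


def count_terms(degree, num_vars):
    """Number of power products of the given degree in num_vars variables."""
    return comb(degree + num_vars - 1, num_vars - 1)


def enumerate_terms(degree, num_vars):
    """All exponent vectors of the given total degree."""
    terms = []
    for combo in combinations_with_replacement(range(num_vars), degree):
        exponents = [0] * num_vars
        for v in combo:
            exponents[v] += 1
        terms.append(tuple(exponents))
    return terms


def binary_case_counts(n1, n2):
    """(number of equations, number of unknowns) for two binary forms."""
    target = n1 + n2 - 1
    equations = count_terms(target - n1, 2) + count_terms(target - n2, 2)
    unknowns = count_terms(target, 2)
    return equations, unknowns
```

In this special case the first form contributes $n_2$ equations and the second $n_1$, matching the $n_1 + n_2$ power products of degree $n_1 + n_2 - 1$ in two variables.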

Let $D$ be the determinant of the coefficients of the unknowns in the equations
(20). Assume for the present that $D \ne 0$ for general forms (18) which
we are considering. Let $\beta_1, \beta_2, \dots, \beta_r$ denote the degrees of the power
products in the several sets of variables in equations (20) and suppose that
the elements of the last column of $D$ are the coefficients of

B2 8, Br 

in these equations. Multiply each column of $D$ by the power product of
which its elements are coefficients and add to the last column. Each element
of the last column is now of the form $p_i \phi_i$. Hence if we expand $D$ in terms of
the elements of the last column we get

Deiat - =0 (mod oi, $2,°°*, Gm). 
From the discussion in §4 and Lemma 1 it follows that D is divisible by the 
resultant of the given forms. 

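Now $D$ is clearly homogeneous of degree given by the expression (21) in
the coefficients of $\phi_1$. Let us calculate by Theorem 3 the degree of the resultant
in these coefficients. This degree is the coefficient of $t_1 t_2 \cdots t_s\, t_{s+1}^{\alpha_{s+1}} \cdots t_r^{\alpha_r}$
in

$$\prod_{i=2}^{m} (n_{i1} t_1 + n_2 t_2 + \cdots + n_s t_s + t_{s+1} + \cdots + t_r).$$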


But this is 




( > ns) coefficient 


i=2 


(Mote +++ Mote 


= ( na) ns n,|(m a,!). 


Thus the degree of $D$ in the coefficients of $\phi_1$ is the same as the degree of the
resultant in these coefficients and similarly for each $\phi_i$ $(i = 2, 3, \dots, m)$. As
$D$ contains the resultant as a factor, $D$ must be the resultant provided $D \ne 0$.

We proceed to show that $D \ne 0$ by a process of induction. We assume Theorem 4
for the case of the proper number of forms of the general type (18)
in fewer variables. It is known to be true for the case of two ordinary binary
forms.

Let $\omega_i$ $(i = 1, 2, \dots)$ represent the power products of the variables occurring
in the equations (20), that is, the power products of the degrees
mentioned in the statement of Theorem 4. We first of all specialize $\phi_1$ by
placing


Ns 
(23) oi = = Xe °° * 


Then in each row of $D$ arising from $\phi_1'$ we have one and only one element
different from zero and it is equal to unity. The columns of $D$ in which a 1
thus occurs are those whose elements are coefficients of an $\omega_i$ which is
divisible by $\phi_1'$. Let us strike these rows and columns from $D$ and denote the
remaining determinant by $D'$. Thus $D' = \pm D$.

The power products w; not divisible by ¢{ may be arranged in mutually 


exclusive sets as follows. Let w‘” denote those which contain x exactly to 


the degree g (¢=0,1,--+, mu—1); those divisible by - 2374 
and containing x,; to the degree g (¢=0, 1,---, mp—1; p=2, 3,---, 5); 
Xen (R=1, 2,--+, 27-5). Each power product w; falls into one and only 
one of these sets or is divisible by ¢/. 

Consider now the set of all power products p multiplying ¢2, - - - , dm, 
in equations (20). These are of various degrees. The same power product may 
occur multiplying different forms; in this case we count it as many times as 
it appears. We suppose further that these p’s are so labeled that having given 
a particular p we know which form it multiplies in the equations (20). Thus 


specifying a given p designates a row of D’. We now define p™, Py and 


psx by the same conditions used in defining w, w’ and w,,, respectively. 
y , + y 




In particular p is the set of p’s which contain x to the degree g (q¢=0, 1, 


- ,M,—1), and so on. To get this set we first select those multipliers of 
¢2: with this property, then those multiplying ¢;, and so on, each power 
product being taken as many times as it appears. Each p falls into one and 
only one of the above sets and none of them is divisible by ¢/. 

A direct calculation shows that the number of elements in the sets 
pi”, Pp, Ps-x is exactly the number of power products w in the sets w,“, 
wy and w,,, respectively. By a proper arrangement of rows and columns 
we may therefore write +D’ in the form 


(m11-1) 


D, 


In this arrangement, the elements falling in the square array D, are 
those in a row of D’ denoted by a p of the set pi and in a column designated 
by an w of the set w:", and so on. We suppose the order of the sets of w’s 
from left to right is 


(0) (0) 
5°°°? 


(0) 
on ql) 
(n11—-1) 
1 
eae 
Dy 
1 
1) 
(n11-1) 
| | 
Pr 
( 
Ws » °° 
» Wr 




The p’s are arranged in the same order from top to bottom. Now there is no 
non-zero element vertically below any of the square arrays D. For example, 
consider the set of columns w,“ (p>1). Every set of p’s following p,“ con- 
tains 
M11 ne qtl 
as a factor, while w,‘” does not contain this term as a factor. This observation 
is all that is necessary to obtain this result. 
Thus we have 
where these denote the determinants of the arrays indicated. 
Let S; denote the resultant of 


(2) 241-0, (bm) 


with the understanding that if a, =1, asis certainly the case fork =1,2, - - -,s, 
we also place x+,2=1. It may now be shown that 


= S (q=0,1,---,m — 1), 
(q) 


(24) = S, (p = 2,3,---,8; g=0,1,---, 1), 
Doze = Sore (k = 1, r—s). 


We make the calculation for a typical case, say D® for convenience. Denote 
by ¢:, ---, @m the general forms obtained from ¢2, - -, dm by placing 
X41 =0, *.2=1 in each form. Then apply Theorem 4, as it is true by the hy- 
pothesis of the induction. We shall use the notation as above, for example 
the rth set of variables will denote those variables which belonged to the 
rth set in the forms (18) although it is only the (r—1)st set here, as the sth 
set is lacking. We have then equations of the type 


(25) nidi = 0 L 


where 7.9; is of degree >.” —1 in the variables of the first set; of degree 
(m—k)n,—1 in the variables of the kth set (k=2, 3, - - - , s—1); of degree 
+a,+1 in the variables of the (s+/)th set (J=1, 2, - - - ,¢—1); 
and of the first degree in the variables of the rth set. But we obtain exactly 
these power products occurring in the equations (25) if we divide those of 
the set w,“” by their common factor, 


| = 1,2,---,7) 




Similarly we obtain all these power products 7; in (25) by dividing those of 
the set p,“ by this same factor. 

Now the determinant of the coefficients of the unknowns in the equations 
(25) is S,. Multiply these equations by the common factor of the elements of 
w, and we have the equivalent system of equations 


The determinant of the coefficients in these equations is D as the power 
products in this set of equations are exactly those of w”, and D™ is seen 
to contain no coefficient not occurring in the forms ¢2, - - - , ém. Thus we 
see that D{@ =S,. In a like manner the other relations (24) may be verified. 
We have then that 


and no one of these factors is zero. Thus for general forms D is not zero. 
This completes the proof of Theorem 4. 

The form of the determinant D obtained for a given system of forms 
(18) clearly depends upon the convention as to which set of variables is the 
second, which the third, and so on. Thus in general we have a variety of 
determinantal expressions for the resultant of forms of the type (18). 


SMITH COLLEGE,
NORTHAMPTON, MASS.


AN AXIOMATIC BASIS FOR PLANE GEOMETRY* 


BY 
STEWART S. CAIRNS 


1. The axioms. The fourth appendix of Hilbert's Grundlagen der Geometrie†
is devoted to the foundation of plane geometry on three axioms pertaining
to transformations of the plane into itself. The object of the present
paper is to attain the same end by quicker and simpler means. The simplifications
are made possible by using orientation-reversing transformations‡ and
changing Hilbert's second and third axioms.

The (x, y)-plane will mean the set of all distinct ordered pairs of real 
numbers. The terms of analytic geometry, to which no geometric content 
need be given, will be used, modified by the prefix (x, y) where ambiguity 
might arise. Thus we shall refer to (x, y)-distance, (x, y)-lines, and so on. 

The general plane, p, will be any set of objects, called points, which can 
be put in one-to-one correspondence with the points of the (x, y)-plane. For 
convenience, we shall speak of the points of p as if they were identical with 
their images under such a correspondence. 

The following axioms pertain to a set, $T$, of continuous§ one-to-one transformations
of $p$ into itself. A transformation of the set which leaves two distinct
points fixed and reverses orientation will be called a reflection.


Axiom 1. The transformations T form a group. 


Axiom 2. If A and B are two points of p, T contains a reflection leaving 
A and B fixed. 


Axiom 3. Let $T_A$ denote the subset of $T$ containing all the transformations
thereof which leave $A$ fixed. If $T_A$ contains transformations carrying pairs of
points arbitrarily near a given pair of points $(B, C)$ into an arbitrarily small
neighborhood of a pair $(D, E)$, then $T_A$ contains a transformation carrying
$(B, C)$ into $(D, E)$.


2. The curve y. Our first object is to establish the following theorem, 
which, like Lemma 1 below, is similar to a result employed by Hilbert 
(loc. cit.). 


* Presented to the Society, September 9, 1930; received by the editors July 10, 1932. This paper
was partly written while the writer was studying under Professor B. von Kerékjártó at Szeged
University as a Travelling Fellow from Harvard University.

† D. Hilbert, Grundlagen der Geometrie, 1930, pp. 178-230.

‡ Suggested by Hilbert, loc. cit., p. 182.

§ That is, continuous in terms of $(x, y)$-distance.




AXIOMS FOR PLANE GEOMETRY 235 


THEOREM 1. Every neighborhood of $A$ contains a simple closed curve, $\gamma$,
enclosing $A$ and preserved by each of the transformations $T_A$.


We shall assume the Jordan separation theorem and the following con- 
verse thereof: 


(A) A locally connected* set of points which divides the $(x, y)$-plane into
two regions, one of them finite, and forms their common boundary, is a simple
closed curve.†


Lemma 1. For any given positive $\varepsilon$, there is a neighborhood, $N$, of $A$ on $p$,
no point of which is carried to a distance $\varepsilon$ from $A$ by any of the transformations
$T_A$ (Axiom 3).


Otherwise, let $P_i$ $(i = 1, 2, \dots)$ be a point within distance $\varepsilon/2^i$ of $A$,
whose image, $Q_i$, under one of the transformations $T_A$ is at distance $\varepsilon$ from $A$.
Then $T_A$ contains transformations carrying points arbitrarily near $A$ into
an arbitrarily small neighborhood of any cluster point, $Q$, of $(Q_1, Q_2, \dots)$.
Therefore, by Axiom 3, $T_A$ contains a transformation carrying $A$ into $Q$. But
this is impossible, for the transformations $T_A$ all leave $A$ fixed.


Lemma 2. Let $c$ be any simple closed curve on $N$ enclosing $A$. The set, $\Gamma$, of
all points into which points on $c$ are carried by the transformations $T_A$ is closed.
The transformations $T_A$ all preserve $\Gamma$.


Any cluster point, $P$, of $\Gamma$ is the limit of some series $(P_1, P_2, \dots)$ on $\Gamma$. By
definition of $\Gamma$, one of the transformations $T_A$ carries a point $Q_i$ $(i = 1, 2, \dots)$
on $c$ into $P_i$. Hence, if $Q$ is a cluster point, necessarily on $c$, of $(Q_1, Q_2, \dots)$,
$T_A$ (see Axiom 3) contains transformations carrying points arbitrarily near
$Q$ into an arbitrary neighborhood of $P$. Therefore (Axiom 3), $T_A$ contains a
transformation carrying $Q$ into $P$. Hence $P$ is on $\Gamma$, and $\Gamma$ is closed.

Consider the image, $P'$, of any point, $P$, on $\Gamma$ under any transformation,
$T_1$, of the set $T_A$. Let $T_0$ be one of the transformations $T_A$ carrying some
point, $Q$, on $c$ into $P$. Then $T_0 T_1$ carries $Q$ into $P'$. Since $T_0 T_1$ leaves $A$ fixed
and belongs to $T$ (Axiom 1), it belongs also to $T_A$. Therefore $P'$ is on $\Gamma$. This
completes the proof.


* A point set, $S$, is said to be locally connected if, for any $\varepsilon > 0$ and any point $P$ of $S$, there exists
a positive distance, $\delta$, such that all points of $S$ within distance $\delta$ of $P$ are connected with $P$ by a
subset of $S$ entirely within distance $\varepsilon$ of $P$.

† Essentially in this form, the theorem is given by J. R. Kline, these Transactions, vol. 21 (1920),
p. 452. It is a ready consequence of Hahn's characterisation of continuous curves, Jahresbericht der
Deutschen Mathematiker Vereinigung, vol. 23 (1914), p. 318, together with a theorem by R. L.
Moore, Bulletin of the American Mathematical Society, vol. 23 (1917), p. 233, that any two points
of a continuous curve, $S$, can be joined on $S$ by a simple Jordan arc.




Lemma 3. The complement of $\Gamma$ on the $(x, y)$-plane contains just one
unlimited region, $R$. The boundary, $\gamma$, of $R$ divides the $(x, y)$-plane into two regions,
$R$ and $R_0$, and forms their common boundary.


The first part of the lemma follows from Lemmas 1 and 2. It also follows
from these lemmas that $\gamma$ is on $\Gamma$. Let $P$ denote any point neither in $R$ nor
on $\gamma$, and $k$ a simple arc through $P$ with just its end points, $P_1$ and $P_2$, on $\gamma$.
Some transformation, $T_j$ $(j = 1, 2)$, of the set $T_A$ carries a point, $Q_j$, of $c$ into
$P_j$ (Lemma 2). A simple arc inside $c$ joining $A$ to $Q_j$ is carried by $T_j$ into an
arc, $k_j$, joining $A$ to $P_j$ but not meeting either $R$ or $\gamma$. Because $R$ is connected,
$(k + k_1 + k_2)$ cannot enclose any point of either $R$ or its boundary, $\gamma$. Therefore
$\gamma$ cannot separate $P$ from $A$. Hence all points neither in $R$ nor on $\gamma$ are in a
single region, $R_0$. By such a curve as $k_j$, any point on $\gamma$ can be joined to $A$
inside $R_0$. Therefore, $\gamma$ is the common boundary of $R_0$ and $R$.


Lemma 4. The boundary $\gamma$ is locally connected.


Suppose that at some point, $X$, $\gamma$ is not locally connected. Then a positive
number, $d$, exists, so small that every neighborhood of $X$ contains points on
$\gamma$ not connected with $X$ by any continuum on $\gamma$ entirely within distance $d$ of $X$.
Let $P_i$ $(i = 1, 2, \dots)$ be one such point within distance $d/2^i$ of $X$. Let $C$
denote the $(x, y)$-circle of radius $d$ about $X$ and $K_i$ the set of all points
connected with $P_i$ on $\gamma$ inside $C$. Then, if $K_i$ and $K_j$ have a point in common,
they coincide. It may readily be seen that $K_i$ contains all its cluster points
inside $C$. Therefore, at most a finite number of the $K$'s can coincide with any
one of them. Otherwise, an infinite subset of $(P_1, P_2, \dots)$ would belong to
one of the $K$'s, which would therefore contain $X$ and join it in $C$ to certain
of the $P$'s. Hence, with no loss of generality, we may assume that the $K$'s
are all distinct.*

Let $C_1$ be the circle of radius $d/2$ about $X$, and $K_i'$ ($i=1, 2, \ldots$) a closed
connected subset of $K_i$ joining $P_i$ to $C_1$ but containing no points outside $C$.
Let $C_2$ be the circle of radius $d/4$ about $X$. Without loss of generality, we
assume* that $K_i'$ contains a point, $Q_i$, on $C_1$ and a point, $S_i$, on $C_2$ such that
$(Q_1, Q_2, \ldots)$ converges to a limit, $Q$, monotonically on the arc $Q_1Q_2Q$, and
$(S_1, S_2, \ldots)$ converges to a limit, $S$, monotonically on the arc $S_1S_2S$. Consider,
for any $i>1$, a simple closed curve made up of two arcs, $k_1$ and $k_2$,
joining $A$ to $S_i$ inside $R_0$ (Lemma 3, proof). Suppose $k_1$ meets the arc $\alpha_i$
on $C_1$, but not the broken line $\beta_i$, whereas $k_2$
meets $\beta_i$ but not $\alpha_i$. Then $(k_1+k_2)$ separates $S_{i-1}$ from $S_{i+1}$. For, let a simple


* To avoid excessive notation, we assume for the K’s several properties enjoyed by some infinite 
subset thereof. 


1933} AXIOMS FOR PLANE GEOMETRY 237 


closed curve, $\gamma_0$, be formed by adding to $\alpha_i$ and $\beta_i$ a pair of arcs joining $Q_{i-1}$
to $P_{i-1}$ and $Q_{i+1}$ to $P_{i+1}$, respectively. Let these latter curves pass through
$S_{i-1}$ and $S_{i+1}$, respectively, and lie so near to $K_{i-1}'$ and $K_{i+1}'$ that $\gamma_0$ encloses
$S_i$ and is met by $k_1$ only on $\alpha_i$ and by $k_2$ only on $\beta_i$. Then $(k_1+k_2)$ clearly
contains just one arc inside $\gamma_0$ separating $S_{i-1}$ from $S_{i+1}$. Therefore $S_{i-1}$ and
$S_{i+1}$ are separated by the closed curve $(k_1+k_2)$. But this is impossible, for
no curve in $R_0$ can enclose points of $R$ (Lemma 3). Therefore, if $c_i$ is a simple
closed curve in $R_0$ through $A$ and $S_i$, then $\alpha_i$ (or $\beta_i$) meets both of the arcs
into which $c_i$ is divided by $A$ and $S_i$.

Now $S_i$ ($i=1, 2, \ldots$), being on $\Gamma$ (Lemmas 2, 3), is image, under a
transformation, $\tau_i$, of the set $T_A$, of some point, $E_i$, on the curve $c$ of Lemma
2. Let $c'$ be a simple closed curve through $A$ which contains no points outside
$c$ but has in common with $c$ an arc through a cluster point, $E$, of $(E_1, E_2, \ldots)$.
With no loss of generality, we assume* that all the $E$'s lie on $c'$ and that
$(E_1, E_2, \ldots)$ converges to $E$. Then $\tau_i$ carries $c'$ into a simple closed curve,
$c_i$, to which the conclusion of the preceding paragraph applies.† We shall
treat only the case where both the arcs $AS_i$ ($i=1, 2, \ldots$) on $c_i$ meet $\alpha_i$.‡
In this case, $c'$ passes through two points, $E_i'$ and $E_i''$, separated on $c'$ by
$(A, E_i)$, where the images of $(E_i', E_i'')$ under $\tau_i$ are on $\alpha_i$. Without loss of
generality, we assume* that $(E_1', E_2', \ldots)$ and $(E_1'', E_2'', \ldots)$ converge
to a pair of points, $E'$ and $E''$, respectively. Now, by definition, $\alpha_i$, for $i$ large
enough, is in an arbitrarily small given neighborhood of $Q$. Hence, since
$\tau_i$ carries $(E_i', E_i'')$ onto $\alpha_i$, $T_A$ contains a transformation (Axiom 3)
carrying $(E', E'')$ into $Q$. Hence $E'=E''$. Since $A$ and $E_i$ separate $E_i'$ and
$E_i''$ on $c'$, $E'$ and $E''$ can coincide only at $A$ or at $E$. But $A$ cannot go into $Q$
under any of the transformations $T_A$. Hence $E'=E''=E$. Then, for $i$ large
enough, $\tau_i$ carries a pair of points $(E_i, E_i')$ arbitrarily near $E$ into an arbitrary
neighborhood of the pair $(S, Q)$. Therefore (Axiom 3) $T_A$ contains a
transformation carrying $E$ into $(S, Q)$. This contradicts the one-to-one-ness
of the transformations and establishes the lemma.

Theorem 1 above is an immediate consequence of (A) together with Lemmas
2, 3, and 4.

3. Lines and reflections. The set of all fixed points under a reflection
(see §1) will be called a line.


* See footnote on p. 236.

† To show this, $c'$ may be slightly deformed, if necessary, so that its image meets $\gamma$ only at $S_i$.

‡ The method applies equally well if both arcs meet $\beta_i$. We need only replace $\alpha$ by $\beta$ and $Q$ by $P$.
In assuming that the arc $AS_i$ meets $\alpha_i$ (or $\beta_i$) for all values $i$, we employ the convention stated in
the footnote on p. 236.




Lemma 1. An orientation-preserving transformation, $\tau$, of the group $T$ which
preserves a simple closed curve, $\gamma$, and leaves one point of $\gamma$ fixed, leaves every
point of $\gamma$ fixed.*

If $A$ denote the known fixed point on $\gamma$, then $\tau$ and all its powers belong
to the set $T_A$ (Axiom 3). Let $P_i$ be the image of $P_0$ under the $i$th power of $\tau$.
Ascribing a positive sense to $\gamma$, consider the arc $AP_0$. If it passes through $P_1$,
then the arc $AP_1$, being the image under $\tau$ of $AP_0$, passes through $P_2$; and, in
general, $AP_i$ ($i=2, 3, \ldots$) passes through $P_{i+1}$, but not $P_{i-1}$. On the other
hand, if $P_0A$ on $\gamma$ passes through $P_1$, then $P_iA$ contains $P_{i+1}$ but not $P_{i-1}$.
Thus, in either case, $(P_1, P_2, \ldots)$ is a monotonic series on the curve $\gamma$. If
$P$ is its limit, then, for $i$ sufficiently large, the two points $(P_i, P_{i+1})$ are in an
arbitrary preassigned neighborhood of $P$. But these two points are images,
under $\tau^i$, of $(P_0, P_1)$. Therefore (Axiom 3) some transformation of the set $T_A$
carries both $P_0$ and $P_1$ into $P$. This implies that $P_0$ and $P_1$ coincide, and hence
that every point of $C$ is fixed under $\tau$.


COROLLARY. The transformation $\tau$ is the identity.


First, since $\tau$ is continuous, the set, $S$, of its fixed points is closed with
respect to $p$. Suppose $S$ does not coincide with $p$, and consider the largest
connected subset, $S_1$, of $S$ which contains $C$. Let $k$ be a simple curve joining a
point, $Q$, of $(p-S)$ to a point, $B$, of $S_1$, but not containing any point of
$(S_1-B)$. Let $\gamma'$ be a simple closed curve about $B$ which meets both $k$ and $S_1$
and which is carried into itself by $\tau$ (Theorem 1). Since $\gamma'$ meets $S_1$, all its
points are fixed under $\tau$ and hence belong to $S_1$. This contradicts the definition
of $k$ and thus establishes the corollary.


Lemma 2. A simple closed curve which is preserved by a reflection, $\rho$, is met
in just two points by the line which $\rho$ defines.


For, an orientation-reversing transformation which preserves a simple 
closed curve leaves just two points on the curve fixed. 


Lemma 3. The identity is the only orientation-preserving transformation of 
the group T which leaves every point of a line, L, fixed. 


This follows from the preceding results of this section applied to the curve 
of Theorem 1, where A is on L. 


THEOREM 2. A reflection, $\rho$, is involutory. No two different reflections define
the same line, $L$.


* The proof is patterned after one by Hilbert, loc. cit., pp. 204, 205. 




By Lemma 3, $\rho\rho$ is the identity. Also, if $\rho$ and $\rho'$ both define $L$, $\rho\rho'$ is the
identity. Therefore $\rho=\rho'$.


COROLLARY. A reflection preserves the set of all lines.


Let $L_1$ be any line and $L_2$ its image under an arbitrary reflection, $\rho$. Let
$\rho_1$ be the reflection defining $L_1$. Then, since reflections are involutory, $\rho\rho_1\rho$
leaves just the points on $L_2$ fixed and reverses orientation. Therefore $\rho\rho_1\rho$ is a
reflection and $L_2$ a line.
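
The fixed-point claim in this proof can be checked in one line. The sketch below uses only that the reflection $\rho$ is involutory. The notation $\operatorname{Fix}(\cdot)$ for a set of fixed points is introduced here for illustration and is not the paper's, and the display assumes the amsmath package.

```latex
% Fixed points of the conjugate rho rho_1 rho, where Fix(rho_1) = L_1
% and L_2 = rho(L_1). Fix(.) is notation assumed for this sketch.
\begin{align*}
x \in \operatorname{Fix}(\rho\rho_1\rho)
  &\iff \rho\rho_1\rho(x) = x \\
  &\iff \rho_1\bigl(\rho(x)\bigr) = \rho(x)
      && \text{(apply $\rho$ to both sides; $\rho\rho$ is the identity)} \\
  &\iff \rho(x) \in L_1 \\
  &\iff x \in \rho(L_1) = L_2 .
\end{align*}
```

Because the product $\rho\rho_1\rho$ reads the same in both directions, the computation does not depend on whether products of transformations are read from left to right or from right to left.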


4. Properties of the line. (A) Let $\rho$ be a reflection and $(P_1, P_2)$ a pair of
points interchanged by $\rho$ (Theorem 2). Then any simple Jordan arc, $k$, which
joins $P_1$ and $P_2$ meets the line, $L$, defined by $\rho$.


Since $\rho$ is involutory (Theorem 2), it interchanges $k$ with its image, $k'$.
As a point, $Q$, traces $k$ from $P_1$ to $P_2$, its image, $Q'$, under $\rho$ traces $k'$ from
$P_2$ to $P_1$. As the arc $P_1Q$ on $k$ increases, $Q$ reaches a first position, $Q_1$, which is
on the image, under $\rho$, of the arc $P_1Q$, end points included. Let $Q_2$ be the
image of $Q_1$. If $Q_1$ is not on $L$, the arcs $Q_1Q_2$ on $k$ and $k'$, respectively, have only
their end points in common. Since $\rho$ interchanges these arcs and reverses
orientation, it must leave $Q_1$ and $Q_2$ fixed. Therefore $Q_1(=Q_2)$ is on $L$.


Lemma. A line, L, is locally connected. 


Under a contrary assumption, let $A$ denote a point at which $L$ is not locally
connected; and let $c$ denote an $(x, y)$-circle about $A$, so small that every
neighborhood of $A$ contains points of $L$ not connected with $A$ on $L$ inside $c$.
In particular, consider the neighborhood $N$ of Lemma 1 in §2, where $\epsilon$ is the
radius of $c$. Let $k$ be a simple arc in $N$, with only its end points on $L$, joining
two points not connected on $L$ inside $c$. Then, by (A), $k$ and its image, $k'$,
under the reflection defining $L$ are distinct and are separated by the points of
$L$ inside $(k+k')$. Hence these points of $L$ join the end points of $k$; but this is
contradictory, since, by definition of $N$, $(k+k')$ is entirely inside $c$.


(B) If $c$ is a simple arc with just its end points on a given line, $L$, then $c$ and
its image, $\gamma$, under the reflection, $\rho$, defining $L$, form a simple closed curve. The
points of $L$ inside $(c+\gamma)$ constitute a simple Jordan arc joining the common
end points of $c$ and $\gamma$.


From (A), $(c+\gamma)$ is a simple closed curve. Let $R$ be the set of all points
which can be connected with $c$ by a simple arc inside $(c+\gamma)$ containing no
points of $L$. Let $\lambda$ denote the set of points common to $L$ and the boundary of
$R$. The image, $R'$, of $R$ under $\rho$ obviously consists of all the points which can
be connected with $\gamma$ inside $(c+\gamma)$ without meeting $L$. By (A), $R$ and $R'$ are
distinct. Therefore $\lambda$ separates $c$ from $\gamma$ and connects their common end
points. Also, $(R+\lambda+R')$ contains all points inside $(c+\gamma)$, for otherwise some




point would belong to a finite region bounded solely by points on $L$, and this
region would go into itself under $\rho$ in contradiction with (A). Now $(c+\lambda)$
divides the $(x, y)$-plane into two regions, the finite region $R$ and the region
consisting of $R'$ plus $\gamma$ plus the exterior of $(c+\gamma)$; and $(c+\lambda)$ is the common
boundary of these two regions. Hence, using the above Lemma and §2(A),
$(c+\lambda)$ is a simple closed curve, and $\lambda$ is an arc thereof.


(C) A line, $L$, is homeomorphic with the $x$-axis.


Let $c_1$ be a simple arc with just its end points $(A_1, B_1)$ on $L$, and let $Q$ be
a point of $c_1$. Then $c_1$ plus its image, $\gamma_1$, under the reflection, $\rho$, defining $L$
cuts from $L$ a Jordan arc, $\lambda_1$ (see (B)). Let $\lambda_1$ be put in continuous one-to-one
correspondence with the segment $-1\le x\le 1$ on the $x$-axis, $A_1$ corresponding
to $-1$ and $B_1$ to $+1$.

Proceeding inductively, for $i=1, 2, \ldots$, let $c_{i+1}$ be a simple arc which
passes through $Q$, has only its end points on $L$, and lies outside $(c_i+\gamma_i)$.
Further, let $c_{i+1}$ pass through no point within some positive preassigned
distance, $d$, of the points $A_i$ and $B_i$. The curve $c_{i+1}$ plus its image, $\gamma_{i+1}$, under $\rho$
cuts from $L$ a simple arc, $\lambda_{i+1}$, containing $\lambda_i$. Let $\lambda_{i+1}$ be put in continuous
one-to-one correspondence with the interval $-(i+1)\le x\le (i+1)$ on the
$x$-axis in such a way as to preserve the correspondence of $\lambda_i$ with the interval
$-i\le x\le i$. Let $(A_{i+1}, B_{i+1})$ be the end points of $\lambda_{i+1}$ which correspond to
$-(i+1)$ and $(i+1)$, respectively. By the last condition imposed on $c_{i+1}$,
neither of the series $(A_1, A_2, \ldots)$ and $(B_1, B_2, \ldots)$ has a cluster point on $p$.
Hence the above inductive process leads to a continuous one-to-one
correspondence between the $x$-axis and a portion, $L'$, of $L$ where $L'$ divides $p$ into
two parts. Now the set of all points each inside at least one of the curves
$(c_i+\gamma_i)$ ($i=1, 2, \ldots$) is a neighborhood of $L'$ free from other points of $L$.
Hence $L'$ and $(L-L')$ are distinct. If the set $(L-L')$ is not vacuous, let $c'$ be
an arc with just its end points on $L$, one end point being on $L'$ and one on
$(L-L')$. By (B), the end points of $c'$ are joined on $L$ by a simple arc. This
contradiction establishes that $L'=L$.

(D) If two lines, L and L’, through any point, A, have any other point, B, 
of the neighborhood N (§2, Lemma 1) in common, they coincide. 

Let $\gamma_0$ be a curve about $A$ satisfying Theorem 1 and not enclosing $B$.
Suppose $\gamma_0$ passes through no common point of $L$ and $L'$. As the arc $AB$ on
$L$ is traced from $A$, let $B'$ be the first point reached outside $\gamma_0$ and on $L'$.
Then $B'$ is joined to $\gamma_0$ by two distinct arcs, $k$ and $k'$, on $L$ and $L'$,
respectively. Let $c$ be the simple closed curve about $A$ formed of $k$, $k'$ and an arc on
$\gamma_0$. Using this for the curve $c$ of Lemmas 2, 3, etc., in §2, we are led to a
curve, $\gamma$, satisfying Theorem 1. This curve passes through $B'$. Suppose it




does not. Then $A$ and $B'$ are both inside $\gamma$; and, since $L$ and $L'$ meet $\gamma$ each
in just two points, the arcs $AB'$ on $L$ and $L'$, respectively, are both inside $\gamma$.
Hence $c$ and $\gamma$ meet only on $\gamma_0$. But then* $\gamma=\gamma_0$, which is impossible. Since,
therefore, $\gamma$ passes through the common point $B'$ of $L$ and $L'$, the product of
the reflections defining $L$ and $L'$ is the identity (§3, Lemma 1 and Corollary),
and $L=L'$ (Theorem 2).

5. Further properties. The connection with euclidean plane geometry. 
The remaining developments prepare for a complete deduction of euclidean 
plane geometry. 


Lemma 1. Let $(L_0, L_1)$ be two different lines through any point, $A$, and let
$(A_j, B_j)$ ($j=0, 1$) be the two points (§3, Lemma 2) in which $L_j$ meets the curve $\gamma$
of Theorem 1. Then the points $(A_0, B_0)$ separate $A_1$ from $B_1$ on $\gamma$.

Suppose the contrary, and let $\gamma'$ denote the arc $A_0B_0$ on $\gamma$ containing $A_1$
and $B_1$. Under the reflection defining $L_1$, $L_0$ goes into a line, $L_2$, which meets $\gamma$
in the images $(A_2, B_2)$ of $(A_0, B_0)$, and, by §4(A), both $A_2$ and $B_2$ must be on
the arc $A_1B_1$ of $\gamma'$. Proceeding inductively for $i=1, 2, \ldots$, let $L_{2i+1}$ and $L_{2i+2}$
be the images of $L_1$ and $L_0$, respectively, under the reflection, $\rho_{2i}$, defining
$L_{2i}$. Let $(A_{2i+1}, A_{2i+2})$ be the images under $\rho_{2i}$ of $(A_1, A_0)$ and $(B_{2i+1}, B_{2i+2})$ the
images of $(B_1, B_0)$. Then the arc $A_iB_i$ on $\gamma'$ contains $A_{i+1}$ and $B_{i+1}$. Thus the
series $(A_1, A_2, \ldots)$ converges monotonically on $\gamma'$ to a limit, $X$. Hence, for $i$
large enough, $A_{2i+1}$ and $A_{2i+2}$ are within an arbitrary given distance of $X$.
By Axiom 3, since $\rho_{2i}$ belongs to $T_A$, $T_A$ contains a transformation carrying
$(A_0, A_1)$ into $X$. But our transformations are all one-to-one. This
contradiction establishes the desired result.

Let (A, P) denote any pair of distinct points on the plane p. The set of all 
images of P under reflections leaving A fixed will be called a circle, with A as 
center. 


THEOREM 3. A circle, K, is a simple closed curve. 


(I) Suppose the point $P$ of the above definition is in a neighborhood, $N$,
of $A$ satisfying Lemma 1 in §2, so that $K$ is in a finite region of the $(x, y)$-plane.
Each point, $P'$, on $K$ is image of $P$ under just one reflection leaving $A$
fixed. For suppose there were two such reflections $(\rho_1, \rho_2)$ carrying $P$ into $P'$.
Let $\rho$ be the reflection defining the line $AP'$. Then $\rho_1\rho\rho_2$ and $\rho_1\rho\rho_1$ are both
reflections leaving $A$ and $P$ fixed. Hence they both define the unique line
(§4(D)) through $A$ and $P$. Therefore $\rho_1\rho\rho_2=\rho_1\rho\rho_1$ (Theorem 2), or $\rho_1=\rho_2$.
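
The last step is a cancellation, which the following sketch writes out. It uses only that reflections are involutory (Theorem 2), so that the inverse of $\rho_1\rho$ is $\rho\rho_1$. The symbol $\iota$ for the identity is introduced here for illustration, and the display assumes the amsmath package.

```latex
% Cancel the common factor rho_1 rho, using rho rho = iota and rho_1 rho_1 = iota.
\begin{align*}
\rho_1\rho\rho_2 = \rho_1\rho\rho_1
  &\implies (\rho\rho_1)(\rho_1\rho\rho_2) = (\rho\rho_1)(\rho_1\rho\rho_1) \\
  &\implies \rho(\rho_1\rho_1)\rho\,\rho_2 = \rho(\rho_1\rho_1)\rho\,\rho_1 \\
  &\implies (\rho\rho)\rho_2 = (\rho\rho)\rho_1
      && (\rho_1\rho_1 = \iota) \\
  &\implies \rho_2 = \rho_1
      && (\rho\rho = \iota).
\end{align*}
```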

Using the notation of Lemma 1 above, let $\gamma'$ be one of the two arcs $A_0B_0$
on $\gamma$. As a point, $Q$, traces $\gamma'$ from $A_0$ to $B_0$, the line through $A$ and $Q$ adopts

* Since $\gamma$ is the set of images under $T_A$ of points common to $c$ and $\gamma$ (Theorem 1, and Lemmas
2, 3 in §2), and $T_A$ preserves $\gamma_0$.




the position of every line through $A$ once and only once (Lemma 1), except
that the same line is obtained for $Q=A_0$ as for $Q=B_0$. To eliminate the
exception, regard $A_0$ and $B_0$ as identical, so that $Q$ traces essentially a simple
closed curve. Then the image, $P'$, of $P$ under the reflection defining the line
$AQ$ adopts every position on $K$ once and only once as $Q$ traces $\gamma'$. This affords
a one-to-one correspondence between the points on $K$ and those on $\gamma'$, the
point $A_0(=B_0)$ included. It remains only to show the continuity of this
correspondence. Let $Q$ be any point on $\gamma'$ and $(Q_1, Q_2, \ldots)$ a series of points
converging to $Q$. Let $P_i$ ($i=1, 2, \ldots$) be the point on $K$ corresponding to $Q_i$
and let $P^0$ be any cluster point of the $P$'s. Then there are transformations
leaving $A$ fixed, carrying points arbitrarily near $Q$ into themselves and
carrying $P$ into an arbitrary neighborhood of $P^0$. Hence (Axiom 3) there is a
transformation leaving $A$ and $Q$ fixed and carrying $P$ into $P^0$. This must be
the reflection in the line $AQ$ (§4 (D); §3, Lemma 3 and Theorem 2).
Therefore $P^0$ is the point corresponding to $Q$. This completes the argument for this
special case.

(II) Suppose the theorem false. Then, by (I), as some line, $L$, is traced
from $A$ in one sense or the other, a point, $P$, is reached which is either the last
position for which $K$ is a simple closed curve, or the first for which it is not.
In the first case, by a proof like that of Lemma 1 in §2, $K$ has a neighborhood
consisting of points which remain within distance $\epsilon$ of $K$ under all
transformations of the set* $T_A$. Since this neighborhood contains $P$ it follows from (I)
that $P$ cannot be the last point for which $K$ is a simple closed curve.

We deal with the second case by showing that if every internal point of
the arc $AP$ on $L$ generates a circle which is a simple closed curve, then the
circle, $K$, generated by $P$ is also a simple closed curve. Let it first be noted
that no two different lines $(L_1, L_2)$ through $A$ can meet at a point, $Q$, on $K$.
If they did, then, by §4(D), the arcs $AQ$ on $L_1$ and $L_2$ would be distinct and
hence form a simple closed curve, $c$. But then every line determined by $A$ and
a point inside $c$ would clearly pass through $Q$. By reflections in $L_1$ and $L_2$ one
can show that other lines through $A$ pass through $Q$; indeed, one can show
that all lines through $A$ pass through $Q$, from which it is easy to deduce a
contradiction. Now, for any $\epsilon>0$, there exists a neighborhood, $N_\epsilon$, of $P$,
such that no image of $P$ under the reflection defining a line through a point of
$N_\epsilon$ is at distance greater than $\epsilon$ from $P$. This may be established by an
argument similar to that of Lemma 1 in §2. Consider the correspondence
employed in (I) above. We have seen that it is one-to-one even in the present
case. We can establish its continuity by an argument like that in (I) applied


* The proof of Corollary 1 below shows that these transformations preserve K. 




to images of $P$ in $N_\epsilon$, and, similarly, in a neighborhood of any point on $K$.
Hence $K$ is a simple closed curve.


COROLLARY 1. A circle with center at $A$ is preserved by all the transformations
$T_A$ (Axiom 3).


Let $\tau$ be a transformation of the set $T_A$. Let $P''$ be the image under $\tau$ of
any point $P'$ on $K$, and let $\rho$ be the reflection which leaves $A$ fixed and
interchanges $(P, P')$. Then, if $\rho'$ is the reflection defining the line $AP'$, $\rho\tau$ and
$\rho\rho'\tau$ both carry $P$ into $P''$ and leave $A$ fixed. The one which reverses
orientation is a reflection, for it preserves the curve $\gamma$ of Theorem 1 and therefore
leaves two of its points fixed. Hence $P''$ is on $K$.


COROLLARY 2. A circle is met in just two points by a line through its center.
(See §3, Lemma 2.)

COROLLARY 3. Any two points $(P_1, P_2)$ on a circle are interchanged by some
reflection leaving the center fixed.

Let $\rho_j$ be the reflection leaving the center, $A$, fixed and carrying $P$ into
$P_j$ ($j=1, 2$). If $\rho$ define the line through $A$ and $P$, the product $\rho_1\rho\rho_2$ reverses
orientation, carries $P_1$ into $P_2$ and preserves the circle. It is therefore a
reflection, for it leaves two points on the circle fixed. Hence (Theorem 2) it
interchanges $P_1$ and $P_2$.

THEOREM 4. One and only one line passes through any two distinct points 
of the plane p. 

If two lines through a point, A, have a second point, P, in common, their 
defining reflections preserve the circle through P with center at A and leave 
P fixed. The product of these reflections is therefore the identity and the 
lines coincide (§3). 

COROLLARY 1. For any two points, $A$ and $B$, on the plane $p$, there exists a
reflection interchanging them.

Let $P$ be a common point of the two circles, centers at $A$ and $B$,
respectively, passing through $B$ and $A$, respectively. Let $\rho_1$ be the reflection
interchanging $A$ and $P$ and leaving $B$ fixed (Theorem 3, Corollary 3) and $\rho_2$ the
reflection interchanging $B$ and $P$ and leaving $A$ fixed. Then $\rho_1\rho_2\rho_1$ reverses
orientation, leaves $P$ fixed, and carries $A$ into $B$. Since $\rho_1$ and $\rho_2$ are involutory
(Theorem 2), $\rho_1\rho_2\rho_1$ is also involutory and therefore leaves more than one
point fixed (see proof of §4(A)). Hence it is the required reflection.
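
The properties claimed for the product can be verified by following the three points through it. The sketch below takes the product to be the conjugate $\sigma=\rho_1\rho_2\rho_1$, where $\rho_1$ interchanges $A$ and $P$ and fixes $B$, and $\rho_2$ interchanges $B$ and $P$ and fixes $A$. The symbols $\sigma$ and $\iota$ (the identity) are introduced here for illustration, and the display assumes the amsmath package.

```latex
% Each row follows one point through rho_1, then rho_2, then rho_1.
\begin{align*}
A &\xrightarrow{\ \rho_1\ } P \xrightarrow{\ \rho_2\ } B \xrightarrow{\ \rho_1\ } B, \\
B &\xrightarrow{\ \rho_1\ } B \xrightarrow{\ \rho_2\ } P \xrightarrow{\ \rho_1\ } A, \\
P &\xrightarrow{\ \rho_1\ } A \xrightarrow{\ \rho_2\ } A \xrightarrow{\ \rho_1\ } P, \\
\sigma\sigma &= \rho_1\rho_2(\rho_1\rho_1)\rho_2\rho_1
   = \rho_1(\rho_2\rho_2)\rho_1 = \rho_1\rho_1 = \iota .
\end{align*}
```

So $\sigma$ interchanges $A$ and $B$, fixes $P$, and is involutory. As a product of three orientation-reversing transformations it reverses orientation.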

COROLLARY 2. Let $L$ and $L'$ be any two lines. Let $A$ be any point on $L$ and
$B$ any point on $L'$. Then some transformation in the group $T$ carries $L$ into $L'$
in such a way that $A$ goes into $B$.




The reflection, $\rho_1$, which interchanges $A$ and $B$ (Corollary 1) carries $L$
into a line, $L''$, through $B$ (Theorem 2, Corollary). Some reflection, $\rho_2$, by
Theorem 3, Corollary 3, leaves $B$ fixed and interchanges any two points in
which $L'$ and $L''$, respectively, meet a circle, center at $B$. The product $\rho_1\rho_2$
satisfies the requirements of the present corollary.

There remain no difficulties in defining angles, distances, congruences, 
and proceeding with other geometrical developments, or else establishing the 
axioms in Chapter 1 of Hilbert’s Grundlagen der Geometrie. 

Two geometries rest on the above foundation: the euclidean if one assume
that through a given point not on a given line, $L$, there is but one line which
fails to meet $L$; the Bolyai-Lobatchewsky if one assume that there are two
lines through the point separating the intersecting from the non-intersecting
lines.


LEeHIGH UNIVERSITY, 
BETHLEHEM, PA. 


PROOF OF THE FUNDAMENTAL THEOREMS ON 
SECOND-ORDER CROSS PARTIAL DERIVATIVES* 


BY 
A. E. CURRIER 


1. Introduction. In the present article we prove the following theorem: 


Let $f(x, y)$ be defined on an open region $R$, and let the first partial derivatives
$f_x$ and $f_y$ exist on $R$. Let $A$ be a point set on which the four second-order partial
derivatives

\[ f_{xx},\quad f_{xy},\quad f_{yx},\quad f_{yy} \tag{1.1} \]

exist almost everywhere. Then

\[ f_{xy} = f_{yx} \]

almost everywhere on $A$.


At the conclusion of this paper we state a number of interesting problems 
which are closely connected with the theorems which we prove. We wish to 
thank Dr. S. Saks for many helpful suggestions. 

2. Measurability of second partial derivatives. We require the following 
lemma. 


Lemma 1. Let $u(x, y)$ be a function of Baire, and let $A_n$ be the point set on
which the following inequality is satisfied:

\[ \left| \frac{u(x+h,\, y) - u(x,\, y)}{h} \right| < n, \qquad 0 < h < \frac{1}{n}, \qquad (x, y) \in A_n. \tag{2.1} \]

Then $A_n$ is measurable.
We readily see that the function

\[ \frac{u(x+h,\, y) - u(x,\, y)}{h} \]

is a function of Baire in the space of the three variables $(x, y, h)$. Let $\Lambda$ be the
portion of the space $(x, y, h)$ for which $0<h<1/n$. Let $\Lambda_n$ be the portion of $\Lambda$
for which the inequality (2.1) is satisfied. We see that the complement of the


* Presented to the Society, December 27, 1932; received by the editors, May 20, 1932, and, in 
revised form, June 8, 1932. 


246 A. E. CURRIER [January


set $A_n$ of Lemma 1 is obtained by projecting onto the space $(x, y)$ the part of
the region $0<h<1/n$ of the space $(x, y, h)$ on which the inequality (2.1) fails.
Thus $A_n$ is a complementary analytic set, and hence measurable.*
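
The projection step can be written out in set notation. The sketch below is an illustration under stated assumptions: the inequality (2.1) is read as $|q|<n$ for the difference quotient $q$, and the names $q$, $\Lambda$, $\Lambda_n$ and $\pi$ (the projection onto the $(x, y)$-plane) are chosen here and are not confirmed by the source. The display assumes the amsmath package.

```latex
% q is the difference quotient; Lambda, Lambda_n, pi are names assumed for this sketch.
\begin{align*}
q(x,y,h) &= \frac{u(x+h,\,y) - u(x,\,y)}{h}, \\
\Lambda &= \{(x,y,h) : 0 < h < 1/n\}, \qquad
\Lambda_n = \{(x,y,h) \in \Lambda : |q(x,y,h)| < n\}, \\
A_n &= \{(x,y) : (x,y,h) \in \Lambda_n \text{ for every } h \text{ with } 0 < h < 1/n\}, \\
\complement A_n &= \{(x,y) : (x,y,h) \in \Lambda - \Lambda_n \text{ for some } h\}
   = \pi(\Lambda - \Lambda_n).
\end{align*}
```

Since $q$ is a function of Baire, $\Lambda-\Lambda_n$ is a Borel set, its projection is an analytic set, and analytic sets and their complements are Lebesgue measurable.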
From Lemma 1 the reader will readily see that the following lemma is true. 


Lemma 2. Let $f(x, y)$ be a function of Baire, defined on an open region $R$. Let
the first partial derivative $f_x$ exist on $R$. Then the second partial derivatives $f_{xx}$
and $f_{xy}$ are measurable functions on their respective domains of definition.


3. Cross sections of a closed point set. We make the following definition. 


DEFINITION. If $P$ is a closed point set in the plane $(x, y)$ then $P_x(t)$ and $P_y(t)$
will denote the cross sections of $P$ with the lines $x=t$ and $y=t$ respectively.


We shall require the following lemma. 


Lemma 3. If $P$ is closed† then almost every point $(x_0, y_0)$ of $P$ is a density
point in the linear sense of the cross sections $P_x(x_0)$ and $P_y(y_0)$, and in the
superficial sense a density point of $P$.


We denote by $\varphi(x, y)$ the characteristic function‡ of $P$ and set

\[ \psi(x, y) = \int_{0}^{y} \varphi(x, \eta)\, d\eta . \]

By a well known theorem the partial derivative $\psi_y$ exists and equals
$\varphi(x, y)$ almost everywhere in the plane. That is, $\psi_y(x, y)=1$ almost everywhere
on $P$. This shows that almost every point of $P$ is a density point in the
linear sense of the corresponding cross section $P_x(x)$. A similar proof applies
to the cross sections $P_y(y)$. It is well known that almost every point of $P$ is a
superficial density point of $P$, hence Lemma 3 is correct.
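
The link between the derivative of the integrated characteristic function and linear density can be made explicit. The sketch below assumes the definition $\psi(x,y)=\int_0^y \varphi(x,\eta)\,d\eta$ (the lower limit $0$ is an assumption; any fixed lower limit gives the same derivative), writes $\mu$ for linear Lebesgue measure, and identifies the cross section $P_x(x)$ with its set of ordinates. The display assumes the amsmath package.

```latex
% Difference quotient of psi in y, valid for h of either sign;
% I_h denotes the closed interval with end points y and y+h.
\begin{align*}
\frac{\psi(x,\,y+h) - \psi(x,\,y)}{h}
  &= \frac{1}{h}\int_{y}^{y+h} \varphi(x,\eta)\,d\eta
   = \frac{\mu\bigl(P_x(x) \cap I_h\bigr)}{|h|}, \\
\psi_y(x,y) = 1
  &\iff \lim_{h \to 0} \frac{\mu\bigl(P_x(x) \cap I_h\bigr)}{|h|} = 1 .
\end{align*}
```

The second line says precisely that $(x, y)$ is a density point, in the linear sense, of the cross section $P_x(x)$.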

4. The approximate middle derivative. The middle difference quotient
$\Delta f/\lambda^2$ of a function $f(x, y)$ at the point $(x_0, y_0)$ is defined as follows:

\[ \frac{\Delta f}{\lambda^2} = \frac{f(x_0+\lambda,\, y_0+\lambda) - f(x_0,\, y_0+\lambda) - f(x_0+\lambda,\, y_0) + f(x_0,\, y_0)}{\lambda^2}. \tag{4.1} \]

The approximate middle derivative of $f(x, y)$ at $(x_0, y_0)$ (when it exists)
will be defined as the following approximate limit§:


* N. Lusin, Sur les ensembles analytiques, Fundamenta Mathematicae, vol. 10 (1927), pp. 1-95,
especially pp. 25-26. Also Lusin et Sierpinski, Sur quelques propriétés des ensembles (A), Bulletin de
l'Académie de Cracovie, 1918, p. 44.

† The lemma holds if $P$ is merely measurable.

‡ The characteristic function of a point set equals one on points of the set and equals zero
elsewhere.

§ Cf. Lebesgue, Leçons sur l'Intégration, Paris, 1928, pp. 240-241, for a definition of
approximate limits.




1933] SECOND-ORDER CROSS PARTIAL DERIVATIVES 


\[ \text{approx-}D_m f = \operatorname*{approx-lim}_{\lambda \to 0} \frac{\Delta f}{\lambda^2}. \tag{4.2} \]
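
Two elementary cases show how the middle difference quotient behaves. The sketch below takes the middle increment to be $\Delta f = f(x_0+\lambda, y_0+\lambda) - f(x_0, y_0+\lambda) - f(x_0+\lambda, y_0) + f(x_0, y_0)$. The functions $f(x,y)=xy$ and $f(x,y)=g(x)+h(y)$ are examples chosen here, not taken from the paper, and the display assumes the amsmath package.

```latex
% Case 1: f(x,y) = xy, whose cross derivative is identically 1.
\begin{align*}
\Delta f &= (x_0+\lambda)(y_0+\lambda) - x_0(y_0+\lambda) - (x_0+\lambda)y_0 + x_0 y_0
          = \lambda^2,
 & \frac{\Delta f}{\lambda^2} &= 1 = f_{xy}(x_0,y_0).
\end{align*}
% Case 2: f(x,y) = g(x) + h(y), whose cross derivative is identically 0.
\begin{align*}
\Delta f &= \bigl[g(x_0+\lambda) + h(y_0+\lambda)\bigr] - \bigl[g(x_0) + h(y_0+\lambda)\bigr] \\
         &\quad - \bigl[g(x_0+\lambda) + h(y_0)\bigr] + \bigl[g(x_0) + h(y_0)\bigr] = 0 .
\end{align*}
```

In both cases the ordinary limit of $\Delta f/\lambda^2$ exists, so the approximate limit exists as well and equals $f_{xy}(x_0, y_0)$.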


5. The fundamental lemma. We now state the fundamental lemma as 
follows. 


FUNDAMENTAL LEMMA. Let $f(x, y)$ be a function of Baire defined on an open
region $R$, and let the first partial derivative $f_x(x, y)$ exist on $R$. Let $A$ be the subset
of $R$ on which the partial derivatives

\[ f_{xx},\quad f_{xy} \]

exist and take on finite values. Then*

\[ \text{approx-}D_m f = f_{xy} \tag{5.1} \]

almost everywhere on $A$.


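The functions $f_{xx}$ and $f_{xy}$ are measurable by Lemma 2, hence the set $A$ is
measurable. Let $A_n$ be the part of $A$ for which the following inequalities are
satisfied:

\[ \left| \frac{f_x(x+h,\, y) - f_x(x,\, y)}{h} \right| < n, \tag{5.2} \]

\[ \left| \frac{f_x(x,\, y+k) - f_x(x,\, y)}{k} \right| < n, \tag{5.3} \]

for $(x, y)$ in $A_n$ and $0< |h|, |k| <1/n$.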

We readily see that the sets $A_n$ cover the set $A$. Hence in order to prove
the Fundamental Lemma it is sufficient to show that (5.1) holds almost
everywhere on the set $A_n$.

By Lemma 1 the set $A_n$ is seen to be measurable, and hence by a well
known theorem there exists a sequence $\{P_i\}$ of closed parts of $A_n$ which cover
$A_n$ almost everywhere. Hence in order to prove the Fundamental Lemma it is
sufficient to show that (5.1) holds almost everywhere on each closed part of
$A_n$.

Let $P$ be a closed part of $A_n$. From (5.3) it follows that the function $f_{xy}$ is
bounded on $P$. Since $f_{xy}$ is measurable and bounded on $P$ it is summable
on $P$ and by a well known theorem for almost every point $(x_0, y_0)$ of $P$

\[ \lim_{\lambda \to 0} \frac{1}{\lambda^2} \iint_{P\delta} f_{xy}(x, y)\, dx\, dy = f_{xy}(x_0, y_0), \tag{5.4} \]


* Equation (5.1) implies the existence of approx-$D_m f$, that is, approx-$D_m f$ exists and equals $f_{xy}$
almost everywhere on $A$.




where $\delta$ is the square with one corner at the point $(x_0, y_0)$ and the opposite
corner at the point $(x_0+\lambda, y_0+\lambda)$. Moreover by Lemma 3 almost every point
$(x_0, y_0)$ of $P$ is a density point in the linear sense of the cross sections of $P$ and
in the superficial sense a density point of $P$. Hence in order to prove that
(5.1) holds almost everywhere on $P$ it is sufficient to show that (5.1) holds
for the point $(x_0, y_0)$ of $P$, where $(x_0, y_0)$ is a density point of $P$ in the linear
and superficial senses and is also a point for which (5.4) holds. Thus we see
that in order to prove the Fundamental Lemma it is sufficient to prove the
following auxiliary lemma.


AUXILIARY LEMMA. Let the hypotheses of the Fundamental Lemma be
satisfied. Let $P$ be a closed part of $A$ such that the inequalities (5.2) and (5.3) are
satisfied for $(x, y)$ on $P$.

Let $(x_0, y_0)$ be a density point in the linear sense of the cross sections of $P$ and
in the superficial sense a density point of $P$. Moreover let $(x_0, y_0)$ be a point of $P$
at which equation (5.4) holds.

Then approx-$D_m f$ exists at $(x_0, y_0)$ and equals $f_{xy}(x_0, y_0)$.


6. Proof of the Auxiliary Lemma. Let $e$ be the set of values of $\lambda$
corresponding to which the points $(x_0, y_0+\lambda)$ lie in $P$. We see that the point $\lambda=0$
is a density point of $e$, since $(x_0, y_0)$ is a density point in the linear sense of the
cross section $P_x(x_0)$ of $P$. For $\lambda$ in $e$ and constant we see by (5.2) that
$f_x(x, y_0+\lambda)$ is uniformly bounded for $|x-x_0|<1/n$. Hence the middle
increment $\Delta f$ at $(x_0, y_0)$ can be expressed by means of an integral as follows:

\[ \Delta f = \int_{x_0}^{x_0+\lambda} \bigl[ f_x(x,\, y_0+\lambda) - f_x(x,\, y_0) \bigr]\, dx . \tag{6.1} \]
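
The passage from the middle increment to the integral (6.1) can be spelled out. The sketch below assumes $|\lambda|<1/n$ and $\lambda$ in $e$, and relies on the theorem of Lebesgue that a function of one variable which is differentiable everywhere on an interval with bounded derivative is the integral of that derivative. It is an illustration of the step, not the author's wording, and the display assumes the amsmath package.

```latex
% Group the four terms of the middle increment into two increments in x,
% then integrate the bounded derivative f_x along y = y_0 + lambda and y = y_0.
\begin{align*}
\Delta f
  &= \bigl[f(x_0+\lambda,\,y_0+\lambda) - f(x_0,\,y_0+\lambda)\bigr]
   - \bigl[f(x_0+\lambda,\,y_0) - f(x_0,\,y_0)\bigr] \\
  &= \int_{x_0}^{x_0+\lambda} f_x(x,\,y_0+\lambda)\,dx
   - \int_{x_0}^{x_0+\lambda} f_x(x,\,y_0)\,dx \\
  &= \int_{x_0}^{x_0+\lambda}
     \bigl[f_x(x,\,y_0+\lambda) - f_x(x,\,y_0)\bigr]\,dx .
\end{align*}
```

The boundedness of $f_x(x, y_0)$ needed for the second integral is the case $\lambda=0$ of the remark preceding (6.1), since $(x_0, y_0)$ itself lies in $P$.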


As before, let $\delta$ be the closed square with one corner at $(x_0, y_0)$ and the
opposite corner at $(x_0+\lambda, y_0+\lambda)$. Let $e_1$ be the projection of $P\delta$ on the $x$-axis,
and let $e_2$ be the complement of $e_1$ with respect to the closed interval
$x_0\le x\le x_0+\lambda$ (or $x_0+\lambda\le x\le x_0$ if $\lambda<0$).

Let $O$ denote the ordinate which passes through the point $(x, 0)$. Then the
product set $OP$ is the cross section $P_x(x)$ of $P$. The set $OP\delta$ is closed and
non-empty for $x$ in $e_1$. The complementary set $O\delta-OP\delta$ is open on $O\delta$ and consists
of at most a denumerable infinity of open linear intervals on the ordinate $O$.
Let $(x, \alpha)$ and $(x, \beta)$ be the end points of a general one of these intervals. In
the difference quotient

\[ \frac{f_x(x, \beta) - f_x(x, \alpha)}{\beta - \alpha} \]

at least one of the points $(x, \alpha)$, $(x, \beta)$ is a point of $P$. Moreover $|\beta-\alpha| < |\lambda|$.




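Hence by (5.3) we have

\[ \bigl| f_x(x, \beta) - f_x(x, \alpha) \bigr| < n\,|\beta - \alpha| . \tag{6.2} \]

Hence for $x$ in $e_1$ and $|\lambda|<1/n$ the difference $f_x(x, y_0+\lambda) - f_x(x, y_0)$ can be
expressed as follows:*

\[ f_x(x,\, y_0+\lambda) - f_x(x,\, y_0) = \int_{OP\delta} f_{xy}(x, y)\, dy + \sum_{\alpha,\beta} \bigl[ f_x(x, \beta) - f_x(x, \alpha) \bigr]. \tag{6.3} \]

Comparing (6.1) and (6.3) we see that for $\lambda$ in $e$ and $|\lambda|<1/n$ the
middle increment $\Delta f$ can be expressed in the form

\[
\begin{aligned}
\Delta f ={}& \int_{e_1} dx \int_{OP\delta} f_{xy}(x, y)\, dy
  + \int_{e_1} \sum_{\alpha,\beta} \bigl[ f_x(x, \beta) - f_x(x, \alpha) \bigr]\, dx \\
 &+ \int_{e_2} \bigl[ f_x(x,\, y_0+\lambda) - f_x(x,\, y_0) \bigr]\, dx .
\end{aligned}
\tag{6.4}
\]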


As remarked above the point $\lambda=0$ is a density point of $e$. Hence in order
to prove the Auxiliary Lemma it is sufficient to show that

\[ \lim_{\lambda \to_{e} 0} \frac{\Delta f}{\lambda^2} = f_{xy}(x_0, y_0) \tag{6.5} \]

where the notation on the left indicates that $\lambda$ is to approach zero through
values which lie in $e$.

The first integral on the right of (6.4) is merely the double integral of
$f_{xy}(x, y)$ taken over $P\delta$. Because of (5.4) we see that in order to prove (6.5)
it is sufficient to show that the following equations are true:

\[ \lim_{\lambda \to_{e} 0} \frac{1}{\lambda^2} \int_{e_1} \sum_{\alpha,\beta} \bigl[ f_x(x, \beta) - f_x(x, \alpha) \bigr]\, dx = 0, \tag{6.6} \]

\[ \lim_{\lambda \to_{e} 0} \frac{1}{\lambda^2} \int_{e_2} \bigl[ f_x(x,\, y_0+\lambda) - f_x(x,\, y_0) \bigr]\, dx = 0. \tag{6.7} \]

Because of (5.3) we see that the absolute value of the sum in the integrand
(6.6) is bounded by

\[ n \sum_{\alpha,\beta} |\beta - \alpha| = n\,\mu(O\delta - OP\delta) \]

* Cf. Lebesgue, loc. cit., pp. 210-211. The series in (6.3) is readily seen to be absolutely con- 
vergent by (6.2). 




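where $\mu$ denotes linear Lebesgue measure on the ordinate $O$. Hence the
absolute value of the expression (6.6) is bounded by the following expression:

\[ \frac{1}{\lambda^2} \int_{e_1} n\,\mu(O\delta - OP\delta)\, dx \le \frac{n}{\lambda^2} \int_{x_0}^{x_0+\lambda} \mu(O\delta - OP\delta)\, dx = \frac{n\, m(\delta - P\delta)}{\lambda^2} \tag{6.8} \]

where $m$ denotes superficial Lebesgue measure. Since $(x_0, y_0)$ is a superficial
density point of $P$ the expression on the right of (6.8) converges to zero with
$\lambda$. Hence (6.6) is correct.
To prove (6.7) we rewrite the integrand as follows:

\[
\begin{aligned}
f_x(x,\, y_0+\lambda) - f_x(x,\, y_0) ={}& \bigl[ f_x(x,\, y_0+\lambda) - f_x(x_0,\, y_0+\lambda) \bigr] \\
 &+ \bigl[ f_x(x_0,\, y_0+\lambda) - f_x(x_0,\, y_0) \bigr] + \bigl[ f_x(x_0,\, y_0) - f_x(x,\, y_0) \bigr].
\end{aligned}
\tag{6.9}
\]

Thus we see that in order to prove (6.7) it is sufficient to show that the
following three equations hold:

\[ \lim_{\lambda \to_{e} 0} \frac{1}{\lambda^2} \int_{e_2} \bigl[ f_x(x,\, y_0+\lambda) - f_x(x_0,\, y_0+\lambda) \bigr]\, dx = 0; \tag{6.10} \]

\[ \lim_{\lambda \to_{e} 0} \frac{1}{\lambda^2} \int_{e_2} \bigl[ f_x(x_0,\, y_0+\lambda) - f_x(x_0,\, y_0) \bigr]\, dx = 0; \tag{6.11} \]

\[ \lim_{\lambda \to_{e} 0} \frac{1}{\lambda^2} \int_{e_2} \bigl[ f_x(x_0,\, y_0) - f_x(x,\, y_0) \bigr]\, dx = 0. \tag{6.12} \]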


For $\lambda$ in $e$ the point $(x_0, y_0+\lambda)$ lies in $P$ and for $\lambda$ in $e$ and $|\lambda| <1/n$ we see
from (5.2) that the absolute value of the expression (6.10) is bounded by

\[ \frac{1}{\lambda^2}\, n\,|\lambda|\,\nu(e_2) = \frac{n\,\nu(e_2)}{|\lambda|} \tag{6.13} \]

where $\nu$ denotes linear Lebesgue measure in the $x$-space. From the fact that
$(x_0, y_0)$ is a density point in the linear sense of the cross section $P_y(y_0)$ of $P$ we
readily see that the expression on the right of (6.13) converges to zero as $\lambda$
approaches zero. Hence (6.10) is correct. In a similar way we prove that (6.11)
and (6.12) are correct. Hence the Auxiliary Lemma and the Fundamental
Lemma of §5 are correct.
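
The estimate behind (6.10) can be traced in two lines. The sketch below assumes that (5.2) bounds the $x$-difference quotient of $f_x$ by $n$ at points of $P$, writes $I_\lambda$ for the closed interval with end points $x_0$ and $x_0+\lambda$ (a name introduced here), and identifies the cross section $P_y(y_0)$ with its set of abscissas. The display assumes the amsmath package.

```latex
% For x in I_lambda and lambda in e, the point (x_0, y_0 + lambda) lies in P,
% so (5.2) applies with h = x - x_0.
\begin{align*}
\bigl| f_x(x,\,y_0+\lambda) - f_x(x_0,\,y_0+\lambda) \bigr|
  &\le n\,|x - x_0| \le n\,|\lambda|, \\
\left| \frac{1}{\lambda^2} \int_{e_2}
   \bigl[ f_x(x,\,y_0+\lambda) - f_x(x_0,\,y_0+\lambda) \bigr]\,dx \right|
  &\le \frac{n\,|\lambda|\,\nu(e_2)}{\lambda^2} = \frac{n\,\nu(e_2)}{|\lambda|}, \\
\frac{\nu(e_2)}{|\lambda|}
  &\le 1 - \frac{\nu\bigl(P_y(y_0) \cap I_\lambda\bigr)}{|\lambda|} \longrightarrow 0 .
\end{align*}
```

The last line holds because every point of $P_y(y_0)$ with abscissa in $I_\lambda$ lies in $P\delta$, so its abscissa belongs to $e_1$, and because $(x_0, y_0)$ is a linear density point of $P_y(y_0)$.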

7. The first theorem on second-order partial derivatives. We now prove 
the following theorem. 


THEOREM 1. Let $f(x, y)$ be a function of Baire defined on an open region $R$,
and let the first partial derivative $f_x(x, y)$ exist on $R$. Let $A$ be a point set on which
the partial derivatives $f_{xx}$ and $f_{xy}$ exist almost everywhere. Then the approximate
middle derivative approx-$D_m f$ exists almost everywhere on $A$ and




\[ \text{approx-}D_m f = f_{xy} \tag{7.1} \]

almost everywhere on $A$.


Let $A'$ be the part of $A$ on which $f_{xx}$ and $f_{xy}$ exist and take on finite
values. The set $A - A'$ has superficial measure zero,* hence in order to prove
Theorem 1 it is sufficient to show that (7.1) holds almost everywhere on $A'$.
Let $A''$ be the set on which $f_{xx}$ and $f_{xy}$ exist and take on finite values. The set
$A'$ is part of $A''$. By the Fundamental Lemma approx-$D_m f$ exists and equals
$f_{xy}$ almost everywhere on $A''$, hence almost everywhere on $A'$. Thus Theorem
1 is correct.

8. Equality of the cross partial derivatives. We now prove the following: 


THEOREM 2. Let f(x, y) be defined on an open region R, and let the first partial
derivatives $f_x(x, y)$ and $f_y(x, y)$ exist on R. Let A be a point set on which the
four second-order partial derivatives $f_{xx}$, $f_{xy}$, $f_{yx}$, $f_{yy}$ exist almost everywhere. Then
$f_{xy} = f_{yx}$ almost everywhere on A.


Since $f_x$ and $f_y$ exist, the function f(x, y) is continuous in x alone and
continuous in y alone, and is thus a function of Baire. By Theorem 1 approx-$D_m f = f_{xy}$
almost everywhere on A. By reasoning similar to the proof of Theorem 1
we see that approx-$D_m f = f_{yx}$ almost everywhere on A. That is

$f_{xy} = \text{approx-}D_m f = f_{yx}$

almost everywhere on A. Thus Theorem 2 is seen to be true.

9. Generalizations of Theorems 1 and 2. In Theorems 1 and 2 it is not 
necessary to assume that the first derivatives exist. We make the following 
definition: 


DEFINITION 1. Let f(x, y) be an arbitrary function, and let 


(9.1) $\bar f_x^{+}(x, y),\ \underline f_x^{+}(x, y),\ \bar f_x^{-}(x, y),\ \underline f_x^{-}(x, y)$


be the four principal first partial derivatives† of f(x, y) with respect to x. The
second partial derivative $f_{xx}$ is said to exist at $(x_0, y_0)$ if the first derivative
$f_x(x_0, y_0)$ exists (that is, if the four functions (9.1) have the same value at $(x_0, y_0)$)
and if the four functions (9.1) are partially differentiable with respect to x at
$(x_0, y_0)$ and have the same value for their partial derivative with respect to x at
$(x_0, y_0)$.


* The set A − A' consists of the part of A on which $f_{xx}$ or $f_{xy}$ does not exist (this part is of measure
zero) plus the part of A on which either $f_{xx}$ or $f_{xy}$ is infinite. The latter part also has measure zero.
The function $f_{xx}$ is measurable, and the linear measure of each cross section of the part of A on which
$f_{xx} = \pm\infty$ is zero, as proved for example in Hobson, Theory of Functions of a Real Variable, Cambridge,
1927, vol. I, p. 397, Theorem 2. Similarly for infinite values of $f_{xy}$.

† Cf. Carathéodory, Vorlesungen über reelle Funktionen, Berlin, 1927, p. 641.




We make similar definitions for the existence of the remaining second par- 
tial derivatives. We now state certain generalizations of Theorems 1 and 2. 


THEOREM 3. Let f(x, y) be defined on an open region R, and be continuous
in x alone and in y alone. Let $D_x f$ be one of the principal first partial derivatives
(9.1). Let A be a point set on which the partial derivatives

$\frac{\partial}{\partial x} D_x f, \qquad \frac{\partial}{\partial y} D_x f$

exist almost everywhere. Then

(9.2) approx-$D_m f = \frac{\partial}{\partial y} D_x f$

almost everywhere on A.


It is well known that under the hypotheses of Theorem 3 the function
$D_x f$ is a function of Baire. The proof of Theorem 3 now follows in the same
way as the proof of Theorem 1, after a suitable restatement and reproof of
the Fundamental Lemma. No new methods are required.

THEOREM 4. Let f(x, y) be defined on an open region R, and let f(x, y) be
continuous in x alone and in y alone. Let A be a point set on which the second
partial derivatives

$f_{xx},\ f_{xy},\ f_{yx},\ f_{yy}$

exist in the generalized sense almost everywhere. Then

$f_{xy} = f_{yx}$

almost everywhere on A.


The proof of this theorem follows readily, making use of Theorem 3. 

10. Problems. Certain problems present themselves at once. We state a 
few of them here. 

PROBLEM 1. Let f(x, y) be a function of Baire. Are the principal first partial
derivative functions (9.1) also functions of Baire*?


If the answer to Problem 1 is in the affirmative the following theorem 
follows at once by the methods of this paper. 


* For a function f(x) of one variable the answer is in the affirmative; cf. Sierpinski, Sur les 
fonctions dérivées des fonctions discontinues, Fundamenta Mathematicae, vol. 3 (1922), pp. 123-127. 
In case f(x, y) is continuous in x the answer is in the affirmative.


Let f(x, y) be a function of Baire. Let A be a point set on which the second
partial derivatives $f_{xx}$, $f_{xy}$, $f_{yx}$, $f_{yy}$ exist in the generalized sense almost everywhere.
Then $f_{xy} = f_{yx}$ almost everywhere on A.


PROBLEM 2. Let f(x, y) together with its first partial derivatives $f_x$ and $f_y$ be
continuous on an open region R. Let A be a point set on which $f_{xx}$ and $f_{xy}$ exist
almost everywhere. Then we know that approx-$D_m f$ exists and equals $f_{xy}$ almost
everywhere on A. Does the actual middle derivative $D_m f$ exist almost everywhere
on A?


PROBLEM 3. Is Theorem 1 true if f(x, y) is merely measurable?


It is probable that an example can be constructed which will show that 
the answer to this problem is in general negative. 

A large number of related problems can be readily thought of, problems 
which have to do with the reversal of the order of integration in iterated 
integrals, and problems connected with two-dimensional totalization. It is 
of course quite obvious that theorems similar to Theorems 1, 2, 3, and 4 can
be stated concerning the third partial derivatives $f_{xyz}$, etc., of a function
f(x, y, z) of three variables, and corresponding general theorems for the partial
derivatives of functions of m variables.


HARVARD UNIVERSITY, 
CAMBRIDGE, Mass. 




THE CANCELLATION LAW IN THE THEORY OF 
CONGRUENCES TO A DOUBLE MODULUS* 


BY 
MORGAN WARD 


1. Let m be an integer greater than unity and f(x) a fixed polynomial
with integral coefficients.† If the leading coefficient of f(x) is prime to m, then
the quotient and remainder obtained on dividing any other polynomial by
f(x) have integral coefficients modulo m. Hence, as is well known, all polynomials
may be separated into a finite number of residue classes $\mathfrak{A}, \mathfrak{B}, \cdots, \mathfrak{U}, \cdots$
which form a commutative ring‡ with respect to the operations of
addition and multiplication (modulis m, f(x)). I propose here to determine
what inferences can be drawn concerning the ring elements $\mathfrak{U}$ and $\mathfrak{V}$ from
the ring equality $\mathfrak{A}\mathfrak{U} = \mathfrak{A}\mathfrak{V}$ when $\mathfrak{A} \neq 0$. Since $\mathfrak{A}\mathfrak{U} = \mathfrak{A}\mathfrak{V}$ is equivalent to
$\mathfrak{A}(\mathfrak{U} - \mathfrak{V}) = 0$, we may assume that $\mathfrak{V} = 0$. Stated in terms of congruences
our problem is then equivalent to the following one:

Suppose that $f(x) = c_0x^k + c_1x^{k-1} + \cdots + c_k$ is a fixed polynomial with
integral coefficients $c_0, \cdots, c_k$ and that m is an integer prime to $c_0$. Let A(x) be
a given polynomial such that

$A(x) \not\equiv 0$ (modd m, f(x)).

To determine all polynomials U(x) such that

(1.1) $A(x)U(x) \equiv 0$ (modd m, f(x)).

The problem is essentially a generalization of the problem of solving
The problem is essentially a generalization of the problem of solving 


$au \equiv 0$ (mod m)


for given integers a and m. Nevertheless it does not seem to have been con- 
sidered heretofore save in very special cases. 
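
For orientation, the integer problem $au \equiv 0$ (mod m) can be checked by direct enumeration. The following is only an illustrative sketch, not part of the paper: the function names `solutions_mod` and `predicted` are hypothetical, and the closed form used (the solutions are the multiples of m/gcd(a, m)) is the standard elementary fact rather than anything stated in the text.

```python
from math import gcd

def solutions_mod(a, m):
    """All residues u in [0, m) with a*u ≡ 0 (mod m), found by brute force."""
    return [u for u in range(m) if (a * u) % m == 0]

def predicted(a, m):
    """Multiples of m // gcd(a, m) below m (standard closed form, assumed here)."""
    step = m // gcd(a, m)
    return list(range(0, m, step))

# Illustrative values only: a = 6, m = 8.
a, m = 6, 8
print(solutions_mod(a, m))
print(solutions_mod(a, m) == predicted(a, m))
```

The double-modulus problem (1.1) replaces the single integer u by a polynomial U(x) and the single modulus by the pair m, f(x).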

I shall first of all show that it is sufficient to consider the case when m
is a power of a prime p, say $m = p^N$, and when f(x) is congruent modulo p
to a power of an irreducible polynomial $\phi(x)$ (mod p):

$f(x) \equiv B(x) \equiv \{\phi(x)\}^{\beta}$ (mod p).


* Presented to the Society, August 31, 1932; received by the editors May 24, 1932. 

† We shall restrict the term polynomial in what follows to mean a polynomial with integral
coefficients.

‡ van der Waerden, Moderne Algebra, Berlin, 1930, vol. I, p. 37; Haupt, Einführung in die Algebra,
Leipzig, 1929, vol. I, chapter V.




This reduction corresponds to the fact that the ring associated with the
moduli m and f(x) is the direct sum of rings of the type associated with the
moduli $p^N$ and B(x).

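In this simpler case I shall show that there exists a positive integer $\lambda$
and a set (S) of $\lambda$ polynomials

$A_0(x),\ A_1(x),\ \cdots,\ A_{\lambda-1}(x)$

where $\lambda$ and (S) depend only upon A(x) and B(x) and are independent of N,
such that

$A(x)U(x) \equiv 0$ (modd $p^N$, B(x))

when and only when

$U(x) \equiv p^{N-\lambda}\left(Q_0(x)A_0(x) + pQ_1(x)A_1(x) + \cdots + p^{\lambda-1}Q_{\lambda-1}(x)A_{\lambda-1}(x)\right)$ (mod $p^N$) if $N \geq \lambda$,

$U(x) \equiv Q_{\lambda-N}(x)A_{\lambda-N}(x) + pQ_{\lambda-N+1}(x)A_{\lambda-N+1}(x) + \cdots + p^{N-1}Q_{\lambda-1}(x)A_{\lambda-1}(x)$ (mod $p^N$) if $N < \lambda$,

the polynomials Q(x) being completely arbitrary, save for a restriction upon
their degrees which we shall give later.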

In the ring associated with the double modulus $p^N$, B(x), our results are
equivalent to the theorem that the ideal to which every element $\mathfrak{U}$ of the ring
belongs which satisfies the relation

$\mathfrak{A}\mathfrak{U} = 0$
has a basis of the form 
or of the form 


where $\lambda$ and $\mathfrak{A}_0, \cdots, \mathfrak{A}_{\lambda-1}$ depend only upon $\mathfrak{A}$, p and B(x) and are
independent of N.
2. Suppose that

$m = p_1^{N_1}p_2^{N_2}\cdots p_r^{N_r}$

is the decomposition of m into its prime factors. Then it is readily seen that
a necessary and sufficient condition that the congruence (1.1) hold is that the
congruences

(2.1) $A(x)U(x) \equiv 0$ (modd $p_i^{N_i}$, f(x)), $i = 1, \cdots, r$,

hold. Furthermore, if we know the general solution $U^{(i)}(x)$ of each of the
congruences (2.1), the general solution of the congruence (1.1) can be written



down immediately by means of the Chinese remainder theorem.* Hence it is
sufficient to discuss the case when $m = p^N$, p a prime.
Let

$f(x) \equiv c_0\{\phi_1(x)\}^{\beta_1}\cdots\{\phi_s(x)\}^{\beta_s}$ (mod p), $(c_0, p) = 1$,

be the decomposition of f(x) into primary irreducible polynomials modulo p.
Then by Schönemann's second theorem† there exists a decomposition of
f(x) modulo $p^N$ of the type

$f(x) \equiv c_0B_1(x)\cdots B_s(x)$ (mod $p^N$), $(c_0, p) = 1$,

where the polynomials $B_i(x)$ are primary, and

(2.2) $B_i(x) \equiv \{\phi_i(x)\}^{\beta_i}$ (mod p).

Since Res$\{B_i(x), B_j(x)\}$ is prime to p if $i \neq j$, it easily follows that (1.1)
holds with $m = p^N$ when and only when the s congruences

$A(x)U(x) \equiv 0$ (modd $p^N$, $B_i(x)$), $i = 1, \cdots, s$,

hold. If the solutions of these congruences are known, then the solution of
the congruence (1.1) may be written down by the procedure of the Chinese
remainder theorem.

It is sufficient then to study the congruence

(2.3) $A(x)U(x) \equiv 0$ (modd $p^N$, B(x))

where

$A(x) = a_0x^m + a_1x^{m-1} + \cdots + a_m, \qquad B(x) = b_0x^n + b_1x^{n-1} + \cdots + b_n$

are given polynomials, p is a prime number, N a positive integer, and U(x)
is to be determined. Furthermore

(2.4) $B(x) \equiv \{\phi(x)\}^{\beta}$ (mod p)

where $\phi(x)$ is a primary irreducible polynomial modulo p and $\beta$ a positive
integer. We shall not need to use this last fact in what immediately follows.
Finally, we lose no generality by requiring that U(x) be of lesser degree than
B(x).
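
For very small cases the congruence (2.3) can be solved by exhaustive search, which is a convenient way to get a feel for the statements that follow. The sketch below is illustrative only and is not the method of the paper: the helper names `polymul_mod` and `solutions`, the coefficient-list representation (lowest degree first), and the sample polynomials $A(x) = x + 2$, $B(x) = x^2$, $p = 2$ are all assumptions made for the example. B is assumed primary (leading coefficient 1).

```python
from itertools import product

def polymul_mod(a, u, b, modulus):
    """Remainder of a(x)*u(x) modulo the integer `modulus` and the monic polynomial b(x).
    Polynomials are coefficient lists, lowest degree first."""
    prod = [0] * (len(a) + len(u) - 1)
    for i, ai in enumerate(a):
        for j, uj in enumerate(u):
            prod[i + j] = (prod[i + j] + ai * uj) % modulus
    n = len(b) - 1                       # degree of b; b[n] == 1 is assumed
    for k in range(len(prod) - 1, n - 1, -1):
        c = prod[k]
        if c:
            for j in range(n + 1):       # subtract c * x^(k-n) * b(x)
                prod[k - n + j] = (prod[k - n + j] - c * b[j]) % modulus
    return prod[:n]

def solutions(a, b, p, N):
    """All U of degree < deg b (coefficients in [0, p^N)) with a*U ≡ 0 (modd p^N, b)."""
    modulus, n = p ** N, len(b) - 1
    return [u for u in product(range(modulus), repeat=n)
            if not any(polymul_mod(a, list(u), b, modulus))]

# Hypothetical example: A(x) = x + 2, B(x) = x^2, p = 2.
A, B = [2, 1], [0, 0, 1]
print(solutions(A, B, 2, 2))
print(solutions(A, B, 2, 3))
```

In this example the list for N = 2 contains solutions not divisible by 2, while every solution for N = 3 has even coefficients, which is the kind of behaviour §3 goes on to describe.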

3. The first problem is to determine for a given N the highest power of
p which divides every U(x) satisfying (2.3). We shall show that there exists
an integer $\lambda$ depending only upon A(x) and B(x) such that if $N > \lambda$, every
solution of (2.3) is divisible by $p^{N-\lambda}$, while if $N \leq \lambda$, there exist solutions of


* Dickson, Introduction to the Theory of Numbers, Chicago, 1929, p. 10. 
† For an account of Schönemann's theorems, see Fricke, Algebra, Braunschweig, 1928, vol. II,
chapter 2.




(2.3) which are not divisible by p. If $N > \lambda$, we may therefore write
$U(x) = p^{N-\lambda}W(x)$ and thus reduce (2.3) to a congruence of the same form
with $N = \lambda$. We shall see in the next section that the discussion of (2.3) when
$N \leq \lambda$ presents no difficulties whatsoever.


Let E
denote the (m+n)-rowed Sylvester eliminant of the polynomials A(x) and
B(x), and let

$\mathfrak{E} = (e_{ij}) \qquad (i, j = 1, \cdots, m+n)$

denote the transpose of the matrix of the determinant E.
Suppose that

$E = p^LE'$ where $L \geq 0$, $(p, E') = 1$.
The congruence (2.3) is equivalent to an identity in x of the form

$A(x)U(x) + B(x)V(x) = p^NW(x)$

where the polynomial V(x) is at most of degree m−1 and the polynomial
W(x) at most of degree m+n−1. If we denote the m+n unknown coefficients
of U(x) and V(x) in order by $z_1, z_2, \cdots, z_{m+n}$ and the coefficients of W(x)
by $w_1, w_2, \cdots, w_{m+n}$, then this identity is easily seen to be equivalent to the
system of m+n linear equations

(3.1) $\sum_{i=1}^{t} e_{ji}z_i = p^Nw_j \qquad (j = 1, \cdots, t)$

where for brevity we have written t for m+n. The determinant of this system
is E; hence

$Ez_j = p^N\sum_{i=1}^{t} \bar e_{ij}w_i \qquad (j = 1, \cdots, t),$

$\bar e_{ij}$ denoting the co-factor of $e_{ij}$ in E. Suppose that $p^D$ is the highest power of
p dividing all of the first minors $\pm\bar e_{ij}$ of E. Then on writing $p^LE'$ for E, we
see that

$E'z_j = p^{N+D-L}\sum_{i=1}^{t} e'_{ij}w_i \qquad (j = 1, \cdots, t)$

where $p^De'_{ij} = \bar e_{ij}$. At least one of the numbers $e'_{ij}$ is not divisible by p; suppose
that it is $e'_{kl}$. Then on taking $w_k$ equal to 1 and the remaining w equal to 0,




we obtain a solution of (3.1) such that every z is divisible by $p^{N+D-L}$ and at
least one z, namely $z_l$, is not divisible by any higher power of p. It follows
that the highest power of p dividing all solutions of (3.1) is $p^{N+D-L}$.

The integer $p^{L-D}$ is simply the first elementary divisor of the matrix $\mathfrak{E}$
corresponding to the prime factor p. Writing $\lambda$ for L−D, we have the following
result:

The least value of N such that every solution U(x) of (2.3) of degree less than
B(x) should be divisible by p is $\lambda + 1$, where $p^{\lambda}$ is the first elementary divisor
corresponding to the prime p of the matrix of the eliminant of A(x) and B(x).

Consequently, if $N \leq \lambda$, there exist solutions of (2.3) which are not divisible
by p, while if $N > \lambda$, every solution is divisible by $p^{N-\lambda}$. Since $\lambda = 0$ only
when L = D = 0, we must have $U(x) \equiv 0$ (mod $p^N$) if the resultant of A(x)
and B(x) is prime to p. In the ring associated with $p^N$ and B(x), the corresponding
case is when $\mathfrak{A}\mathfrak{U} = 0$ and $\mathfrak{A}$ is a unit of the ring.
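
The integer $\lambda = L - D$ can be computed mechanically for small examples. The following is a rough sketch under stated assumptions, not the author's procedure: the names `det`, `sylvester`, `valuation` and `ward_lambda` are hypothetical, coefficients are listed highest degree first, the determinant is taken by plain Laplace expansion (adequate only for small matrices), and the resultant is assumed to be nonzero.

```python
from math import gcd
from functools import reduce

def det(mat):
    """Integer determinant by Laplace expansion along the first row (small matrices only)."""
    if not mat:
        return 1
    total = 0
    for j, entry in enumerate(mat[0]):
        if entry:
            minor = [row[:j] + row[j + 1:] for row in mat[1:]]
            total += (-1) ** j * entry * det(minor)
    return total

def sylvester(a, b):
    """Sylvester matrix of a (degree m) and b (degree n); coefficients highest degree first."""
    m, n = len(a) - 1, len(b) - 1
    rows = [[0] * i + a + [0] * (n - 1 - i) for i in range(n)]
    rows += [[0] * i + b + [0] * (m - 1 - i) for i in range(m)]
    return rows

def valuation(x, p):
    """Exponent of the highest power of p dividing the nonzero integer x."""
    v = 0
    while x % p == 0:
        x //= p
        v += 1
    return v

def ward_lambda(a, b, p):
    """L - D, where p^L exactly divides E and p^D is the highest power of p
    dividing every first minor of E."""
    mat = sylvester(a, b)
    t = len(mat)
    E = det(mat)
    if E == 0:
        raise ValueError("resultant is zero; this sketch assumes it is not")
    minors = [det([row[:j] + row[j + 1:] for k, row in enumerate(mat) if k != i])
              for i in range(t) for j in range(t)]
    return valuation(E, p) - valuation(reduce(gcd, minors), p)

# Hypothetical example: A(x) = x + 2, B(x) = x^2, p = 2.
print(ward_lambda([1, 2], [1, 0, 0], 2))
```

For the sample pair $A(x) = x + 2$, $B(x) = x^2$ with $p = 2$, the eliminant is 4 and some first minor is odd, so the sketch returns 2; for $A(x) = x + 1$ the eliminant is prime to 2 and the value is 0.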

4. We can now complete the discussion of the congruence (2.3). If $N > \lambda$,
set $U(x) = p^{N-\lambda}W(x)$, thus obtaining the congruence for W(x)

(4.1) $A(x)W(x) \equiv 0$ (modd $p^{\lambda}$, B(x)).

Among the polynomials W(x) which satisfy (4.1) are some not divisible
by p. Let T(x) be such a one of lowest possible degree. Then the leading coefficient
of T(x) must be prime to p; for if not, by Schönemann's first theorem*
there would exist a polynomial of the form c + pQ(x) where c is prime to p
such that T(x)(c + pQ(x)) would be congruent modulo $p^{\lambda}$ to a polynomial
T'(x) of lower degree than T(x). Then, since Res$\{c + pQ(x), B(x)\}$ is prime
to p, we would have $A(x)T'(x) \equiv 0$ (modd $p^{\lambda}$, B(x)), contradicting our assumption
about the degree of T(x). On multiplying T(x) by a constant prime
to p, we obtain a polynomial $A_0(x)$ with leading coefficient unity and of
minimal degree satisfying (4.1). This polynomial is unique modulo $p^{\lambda}$; for the
difference of two such polynomials would be of lesser degree than either.
Moreover if W(x) is any solution of (4.1), the quotient and remainder obtained
on dividing W(x) by $A_0(x)$ have integral coefficients and the remainder,
being of lower degree than $A_0(x)$, must be divisible by p. Hence

$W(x) = Q_0(x)A_0(x) + pW_1(x)$

where $W_1(x)$ is of lesser degree than $A_0(x)$. On substituting this expression
in (4.1), we obtain a congruence of the same form for $W_1(x)$:

$A(x)W_1(x) \equiv 0$ (modd $p^{\lambda-1}$, B(x)).


* Fricke, loc. cit., p. 59. 




We now repeat the previous argument. Every solution of this congruence
must be of the form

$W_1(x) = Q_1(x)A_1(x) + pW_2(x)$

where $A_1(x)$ is a solution of minimal degree in x with leading coefficient unity
uniquely determined modulo $p^{\lambda-1}$, while $W_2(x)$ is of lesser degree than $A_1(x)$.

We find on continuing in this manner that the general solution of (4.1)
is of the form

$W(x) = Q_0(x)A_0(x) + pQ_1(x)A_1(x) + \cdots + p^{\lambda-1}Q_{\lambda-1}(x)A_{\lambda-1}(x)$

where the polynomial $A_i(x)$ is uniquely determined modulo $p^{\lambda-i}$.

We shall show in the next section that two consecutive polynomials
$A_i(x)$ and $A_{i+1}(x)$ are equal only when all the polynomials $A_i(x)$, $A_{i+1}(x)$,
$A_{i+2}(x), \cdots, A_{\lambda-1}(x)$ are equal, a circumstance which may occur for special
choice of A(x) and B(x). If the degrees of A,(x) and Q;(x) are a; and ¥; re- 
spectively, then it is clear that 

Qi — > 2 O (i =0,1,---,r—1). 

The modification when the initial value of N is less than $\lambda$ is obvious,
and will be left to the reader. The results stated in the beginning of the paper
are thus established.


5. We shall conclude by showing how the polynomials $A_{\lambda-1}(x)$, $A_{\lambda-2}(x)$,
$\cdots$, $A_0(x)$ may be determined. We first observe that since

(5.1) $A(x)A_i(x) \equiv 0$ (modd $p^{\lambda-i}$, B(x))

we have $A(x)A_i(x) \equiv 0$ (modd $p^{\lambda-i-1}$, B(x)). Therefore by the fundamental
property of $A_{i+1}(x)$,

(5.2) $A_i(x) \equiv 0$ (modd p, $A_{i+1}(x)$) $\qquad (i = 0, 1, \cdots, \lambda-2)$.


We have seen in §2 that we may assume that B(x) is of the form
$\{\phi(x)\}^{\beta} + pV(x)$ where $\phi(x)$ is primary and irreducible modulo p. If we
construct a Schönemann decomposition of A(x) modulo $p^{\lambda}$, it is easily seen that
we may assume that A(x) is of the same form; thus

(5.3) $B(x) = \{\phi(x)\}^{\beta} + pV(x), \qquad A(x) = \{\phi(x)\}^{\alpha} + pR(x)$

where $\alpha < \beta$, and the degrees of V(x) and R(x) are less than those of B(x)
and A(x) respectively. Hence

$A(x)\{\phi(x)\}^{\beta-\alpha} \equiv pR(x)\{\phi(x)\}^{\beta-\alpha} - pV(x)$ (mod B(x)).

If $p^M$ is the highest power of p dividing the right side of this last congruence,
we have

$A(x)\{\phi(x)\}^{\beta-\alpha} \equiv 0$ (modd $p^M$, B(x)), $\quad \not\equiv 0$ (modd $p^{M+1}$, B(x))




and we may take 
= Ayo(x) = = = (mod p*). 
Let i denote an integer $< \lambda - M$. Then

(5.4) $A(x)A_i(x) \equiv p^{\lambda-i}S_i(x)$ (mod B(x)),

where $S_i(x)$ is of lesser degree than B(x).

We may assume that $S_i(x)$ is not divisible by p and is of lesser degree than
A(x). For since $A_i(x)$ is determined only modulo $p^{\lambda-i}$, if $S_i(x) = pS_i'(x)$ we
have

$A(x)(A_i(x) + p^{\lambda-i}) \equiv p^{\lambda-i}(A(x) + pS_i'(x))$ (mod B(x))

and by (5.3), the polynomial multiplying $p^{\lambda-i}$ on the right is not divisible
by p. In the same way, if S;(x) =Q(x)A(x)+5;'(x) where Si (x) is of lesser 
degree than A(x), then Q(x) is necessarily of lesser degree than A ;(x) so that 
Ai(x)+ Q(x) is a primary polynomial such that 


A(x)(Ai(x) + pP‘Q(x)) = (2) (mod B(z)). 


If $A_i(x)$ is known, we can determine $A_{i-1}(x)$. For, by (5.2),

$A_{i-1}(x) = Q(x)A_i(x) + pR(x)$

where Q(x) must be primary, and R(x) of lesser degree than $A_i(x)$. By (5.4),

$A(x)A_{i-1}(x) \equiv p^{\lambda-i}Q(x)S_i(x) + pR(x)A(x)$ (mod B(x)).

Take $R(x) = p^{\lambda-i-1}$. Then

$A(x)A_{i-1}(x) \equiv 0$ (modd $p^{\lambda-i+1}$, B(x))

when and only when

$Q(x)S_i(x) + A(x) \equiv 0$ (modd p, B(x)),

that is, when and only when

$Q(x)S_i(x) + \{\phi(x)\}^{\alpha} \equiv 0$ (modd p, $\{\phi(x)\}^{\beta}$).

Since $S_i(x)$ is known and is of lesser degree than $\{\phi(x)\}^{\alpha}$ and not divisible
by p, there exists a primary polynomial Q(x) uniquely determined modulo p
which satisfies this congruence. $A_{i-1}(x)$ is now uniquely determined modulo
$p^{\lambda-i+1}$ and may be modified so as to satisfy the conditions corresponding to
those imposed upon $A_i(x)$ in (5.4).

The remaining polynomials $A_{\lambda-M-1}(x), \cdots, A_1(x), A_0(x)$ can therefore
be calculated step by step, and our solution is completed.


CALIFORNIA INSTITUTE OF TECHNOLOGY, 
PASADENA, CALIF. 




A CHARACTERIZATION OF THE CLOSED 2-CELL* 


BY 
HASSLER WHITNEY†


1. Introduction. A number of characterizations have been given of the 
simple closed surface.‡ The proofs involve considerable point set difficulties.
We give here a characterization of the closed 2-cell, that is, a point set homeo- 
morphic with a circle and its interior. The fundamental theorem is partly of a 
combinatorial and partly of a continuity nature. It reads 


THEOREM I. Let R be a continuous curve§ containing the simple closed curve
J, such that

(1) J is irreducibly homologous to zero in R, and

(2) If $\gamma$ is an arc with just its two end points a and b on J, then $R - \gamma$ is not
connected.

Let R' and J' be defined similarly. Then R and R' are homeomorphic, with
J corresponding with J'.


That R is a closed 2-cell then follows immediately from the following 
theorem. We note that J corresponds with the circle, that is, J is the bound- 
ary of R. 


THEOREM II. If I is a circle in the plane and S is I with its interior, then S
and I satisfy the conditions prescribed for R and J in the above theorem.


The exact meaning of Condition (1) of Theorem I is given in §4; a stronger
condition is the following: For every $\epsilon > 0$ and any two points a and b on J,
there is a set of points $a_{ij}$ in R, $1 \leq i \leq m$, $1 \leq j \leq n$, such that all points $a_{1j}$
coincide with a, all points $a_{mj}$ coincide with b, all points $a_{i1}$ lie on one arc ab
of J, all points $a_{in}$ lie on the other arc ab of J, and‖

$\rho(a_{ij}, a_{i+1,j}) < \epsilon, \qquad \rho(a_{ij}, a_{i,j+1}) < \epsilon;$

moreover, this does not hold in any proper subset of R containing J.


* Presented to the Society, October 31, 1931; received by the editors April 13, 1932. 

† National Research Fellow.

‡ That is, a point set homeomorphic with the surface of a sphere. See L. Zippin, American Journal
of Mathematics, vol. 52 (1931), pp. 331-350; these Transactions, vol. 31 (1929), pp. 744-770;
C. Kuratowski, Fundamenta Mathematicae, vol. 13 (1929), pp. 307-318; also references in these
papers.

§ See Lemma A.

‖ $\rho(p, q)$ = distance from p to q, or in general, distance between two point sets; $\delta(S)$ = diameter of
S; $V_\epsilon(S)$ = those points p for which $\rho(p, S) < \epsilon$; $W_\epsilon(S)$ = those points p for which $\rho(p, S) \leq \epsilon$.




Notations and preliminary theorems are given in §§2, 3 and 4; an outline
of the proof of Theorem I will be found in §5. The Jordan and related
theorems follow of course from the above theorems.

2. Point set background. Elementary properties of point sets we shall 
need may be found in Hausdorff, Mengenlehre, chapter VI. A continuous 
curve is a metric space which can be expressed as the continuous image of a 
closed line segment. An arc is the topological image of a closed line segment; 
a simple closed curve, the topological image of a circle. 

Two fundamental lemmas are the following: 


Lemma A.* A compact, connected and locally connected metric space is a 
continuous curve, and conversely. 


Lemma B.† A continuous curve is arcwise connected.


That is, any two points p and q in the set are end points of an arc pq in
the set. Using the definition of a continuous curve, it is easily seen that two
continuous curves which have common points form a continuous curve.

From these lemmas we deduce the following known theorems. 


Lemma C. Any continuum C of diameter $< \epsilon$ in a continuous curve R is
contained in a continuous curve C' in R of diameter $< \epsilon$.


Say $\delta(C) = \epsilon - \epsilon'$. R being the continuous image of a closed line segment,
we can divide this segment into segments so small that the diameter of the
image of each is $< \epsilon'$. We let C' be the union of all of these images which have
points in common with C.


Lemma D. A continuous curve R is locally arcwise connected. 


That is, given a point p and an $\epsilon > 0$, there is a $\delta > 0$ such that if $q \subset V_\delta(p)$,
then there is an arc pq in R of diameter $< \epsilon$. As R is locally connected, we can
take $\delta$ so that if $q \subset V_\delta(p)$, there is a continuum C in R of diameter $< \epsilon$ containing
p and q. The continuum C is contained in a continuous curve C' of
diameter $< \epsilon$, and C' is arcwise connected; hence there is an arc $pq \subset C' \subset R$,
and $\delta(pq) < \epsilon$.

R is of course uniformly locally arcwise connected, by the Borel Theorem. 


Lemma E. A connected open subset R’ of a continuous curve R is arcwise con- 
nected. 


* See G. T. Whyburn, Concerning continuous images of the interval, American Journal of Mathe- 
matics, vol. 53 (1931), pp. 670-674. 

† See references in R. L. Moore, Report on continuous curves, Bulletin of the American Mathematical
Society, vol. 29 (1923), p. 293, footnote (f).




If there are two points p and q in R' which are joined by no arc in R',
let A contain p and all points of R' joined to p by an arc in R', and put
B = R' − A; then there is no arc joining a point of A to a point of B in R'.
As R' is connected, there is a point p' in one of these sets, say B, which is a
limit point of points of the other set, A. As R' is open in R, $\rho(p', R - R') = \epsilon > 0$.
We can take q' in A so close to p' that there is an arc p'q' in R of diameter
$< \epsilon$. But then $p'q' \subset R'$, a contradiction.

Suppose R is connected, and $p \subset R$ is such a point that R − p is not connected.
Then p is called a cut point of R.


Lemma F. Let R be a continuous curve without a cut point. Then for every
$\epsilon > 0$ there is a $\delta > 0$ such that if $\rho(q, p) \geq \epsilon$ and $\rho(q', p) \geq \epsilon$, then there is an arc
qq' with no points in $V_\delta(p)$.


Suppose the contrary. Then there are three sequences of points $\{p_n\}$,
$\{q_n\}$, $\{q_n'\}$, approaching points p, q, q', respectively, with $\rho(q_n, p_n) \geq \epsilon$,
$\rho(q_n', p_n) \geq \epsilon$, and such that for each n, any arc $q_nq_n'$ must contain points in
$V_{\delta_n}(p_n)$, where $\lim_{n\to\infty} \delta_n = 0$. By Lemma D it is seen that for any n greater
than some N there are arcs $q_nq$, $q_n'q'$, with no points in $V_{\epsilon/2}(p)$. It follows that
any arc qq' must pass through p, contradicting Lemma E (as p is not a cut
point).

3. Combinatorial background.* A k-simplex, or abstract k-simplex, is a set
of k+1 elements (say points) $a_1a_2 \cdots a_{k+1}$. The order in which we write the points
is immaterial. For k = 0, 1 and 2 we use also the terms vertex, segment and
triangle respectively. A k-chain is a set of k-simplexes, and is written as the
sum of these simplexes. The sum (mod 2) of several k-chains is the k-chain
containing those simplexes which occur in an odd number of the k-chains.

The boundary K of a k-simplex L, k > 0, is the sum of all (k−1)-simplexes
formed by dropping out one of the vertices of the simplex. We write $L \to K$.
A 0-simplex has no boundary. Thus

$a \to 0, \qquad ab \to a + b, \qquad abc \to ab + ac + bc.$

The boundary of a k-chain is the sum (mod 2) of the boundaries of the simplexes
of the chain. Thus

$ab + bc + cd \to a + d, \qquad abc + bcd \to ab + ac + bd + cd.$


Evidently the boundary of a sum of several k-chains is the sum of the boundaries
of the chains. If a k-chain has no boundary, it is called a k-cycle. (Any 0-chain
is a 0-cycle.) The boundary of a k-chain (k > 0) is a (k−1)-cycle. This is evident
if the k-chain is a k-simplex. The general case then follows from the last
theorem.

* Compare L. Vietoris, Über den höheren Zusammenhang kompakter Räume, Mathematische
Annalen, vol. 97 (1927), pp. 454-472.
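
The mod 2 boundary calculus of this section is easy to experiment with. The following is a minimal sketch, not taken from the paper: simplexes are modelled as `frozenset`s of vertex names, chains as Python sets of simplexes, and the sum (mod 2) as the symmetric difference; the helper names `simplex`, `boundary_of_simplex` and `boundary` are hypothetical.

```python
from itertools import combinations

def simplex(*vertices):
    """An abstract simplex: the (unordered) set of its vertex names."""
    return frozenset(vertices)

def boundary_of_simplex(s):
    """Sum of the faces obtained by dropping one vertex; a 0-simplex has no boundary."""
    if len(s) == 1:
        return set()
    return {frozenset(face) for face in combinations(s, len(s) - 1)}

def boundary(chain):
    """Boundary of a chain (a set of simplexes), added mod 2 via symmetric difference."""
    result = set()
    for s in chain:
        result ^= boundary_of_simplex(s)
    return result

# The chain ab + bc + cd has boundary a + d.
path = {simplex("a", "b"), simplex("b", "c"), simplex("c", "d")}
print(boundary(path) == {simplex("a"), simplex("d")})

# The boundary of a triangle is a 1-cycle: its own boundary is empty.
print(boundary(boundary({simplex("a", "b", "c")})) == set())
```

Representing a chain as a set makes the rule that a simplex occurring an even number of times cancels automatic.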

Lemma G. If $K \to a + b$ is a 1-chain, then there is a chain of segments $aa_1$,
$a_1a_2, \cdots, a_nb$ in K.


For otherwise we could divide the segments of K into two groups $K_1 \supset a$
and $K_2 \supset b$, no two simplexes from different groups having a common vertex.
But then $K_1 \to a$, $K_2 \to b$, which cannot be, as the boundary of any 1-chain contains
an even number of vertices.

A 1-circuit is a 1-cycle of the form $a_1a_2 + a_2a_3 + \cdots + a_{n-1}a_n + a_na_1$, the vertices
being distinct except as shown.


Lemma H. Any 1-cycle K is a sum of 1-circuits.


If $a_1a_n$ is a segment of K, then as $K \to 0$ and $a_1a_n \to a_1 + a_n$, $K + a_1a_n \to a_1 + a_n$.
We can thus find a set of distinct segments and vertices $a_1a_2, \cdots, a_{n-1}a_n$ in
$K + a_1a_n$, not containing $a_1a_n$. This with $a_1a_n$ is a 1-circuit $K_1$. As $K_1 \to 0$,
$K + K_1$ is a 1-cycle containing no segments of $K_1$, and it contains a 1-circuit
$K_2$. Continuing, we find $K = K_1 + K_2 + \cdots + K_m$.
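
The decomposition in Lemma H can be carried out by a simple walk. The sketch below is one possible procedure, assumed for illustration rather than taken from the proof: segments are `frozenset`s of two vertex names, the input is assumed to be a genuine 1-cycle (every vertex lies on an even number of segments), and the function name `circuits` is hypothetical. It walks along unused segments until a vertex repeats, cuts out the closed loop, and starts again on what is left.

```python
def circuits(cycle):
    """Split a 1-cycle (a set of 2-element frozensets in which every vertex has
    even degree) into segment-disjoint 1-circuits."""
    remaining = set(cycle)
    found = []
    while remaining:
        start = next(iter(next(iter(remaining))))
        path = [start]                    # vertices visited so far, all distinct
        used = []                         # segments walked so far
        while True:
            here = path[-1]
            seg = next(s for s in remaining if here in s and s not in used)
            used.append(seg)
            (nxt,) = seg - {here}
            if nxt in path:               # the walk has closed up: cut out the loop
                k = path.index(nxt)
                circuit = set(used[k:])
                found.append(circuit)
                remaining -= circuit
                break
            path.append(nxt)
    return found

# Two triangles abc and cde sharing the vertex c form a 1-cycle.
seg = lambda u, v: frozenset((u, v))
K = {seg("a", "b"), seg("b", "c"), seg("c", "a"),
     seg("c", "d"), seg("d", "e"), seg("e", "c")}
parts = circuits(K)
print(len(parts), sorted(len(part) for part in parts))
```

Removing a circuit from a cycle leaves every vertex with even degree, which is why the walk can always be continued until it closes.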

4. A k-chain K is said to lie in a point set R if each vertex of K is in R.
Any vertex now has both a name and a position. Two vertices are distinct if
their names are distinct, irrespective of whether they coincide in position or
not. $\epsilon$ being a positive number, a k-simplex $K \subset R$ is called an $(\epsilon, k)$-simplex
in R if $\delta(K) < \epsilon$, i.e. if any two vertices of K are within $\epsilon$ of each other. A k-chain
is an $(\epsilon, k)$-chain if each of its simplexes is an $(\epsilon, k)$-simplex. A k-cycle
K in S is said to be $\epsilon$-homologous to zero ($K\ \epsilon{\sim}\ 0$) in R if there is an $(\epsilon, k+1)$-chain
L in R of which K is the boundary. If $K_1\ \epsilon{\sim}\ 0$ and $K_2\ \epsilon{\sim}\ 0$, then $K_1 + K_2\ \epsilon{\sim}\ 0$.
We write also $K_1\ \epsilon{\sim}\ K_2$ for $K_1 + K_2\ \epsilon{\sim}\ 0$. If $K_1\ \epsilon{\sim}\ K_2$ and $K_2\ \epsilon{\sim}\ K_3$,
then $K_1\ \epsilon{\sim}\ K_3$.
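
The distinction between the name and the position of a vertex is easy to model. The following is a small illustrative sketch under assumptions of our own: positions are points of the plane stored in a dictionary keyed by vertex name, the metric is Euclidean, and the function names `is_eps_simplex` and `is_eps_chain` are hypothetical.

```python
from itertools import combinations
from math import dist

def is_eps_simplex(simplex, position, eps):
    """True if every two vertices of the simplex lie within eps of each other."""
    return all(dist(position[u], position[v]) < eps
               for u, v in combinations(simplex, 2))

def is_eps_chain(chain, position, eps):
    """True if each simplex of the chain is an eps-simplex."""
    return all(is_eps_simplex(s, position, eps) for s in chain)

# Two vertices with different names may share a position (here a and a2).
position = {"a": (0, 0), "a2": (0, 0), "b": (3, 0), "c": (0, 4)}
triangle = frozenset({"a", "b", "c"})
print(is_eps_simplex(triangle, position, 6))     # largest side has length 5
print(is_eps_simplex(triangle, position, 4.5))
```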

Suppose the closed set R contains the simple closed curve J. If for every
$\epsilon > 0$ there is a $\delta > 0$ such that any $(\delta, 1)$-cycle on J is $\epsilon{\sim}0$ in R, then we say
that $J \sim 0$ in R. If J is $\sim 0$ in R but is not $\sim 0$ in any proper closed subset of
R containing J, then we say that J is irreducibly $\sim 0$ in R.


Lemma I. Given a simple closed curve J, let us divide it into the arcs* $\widehat{a_1a_2}$,
$\widehat{a_2a_3}, \cdots, \widehat{a_na_1}$, each of diameter $< \epsilon/2$. Let $\delta$ be smaller than the distance
between any two of these arcs which have no common points. Then if
$K' = a_1a_2 + a_2a_3 + \cdots + a_na_1$ and K is any $(\delta, 1)$-cycle on J, K is either $\epsilon{\sim}0$
or $\epsilon{\sim}K'$ on J.


* Here, $\widehat{a_1a_2}$ denotes an arc, and $a_1a_2$, a segment.

By Lemma H, K is a sum of 1-circuits $K_1, \cdots, K_m$. If we show that each
$K_i$ is $\epsilon{\sim}\alpha_iK'$, $\alpha_i = 0$ or 1, it will follow that $K = \sum K_i\ \epsilon{\sim}\ \sum \alpha_iK' = 0$ or $K'$
(depending on whether $\sum \alpha_i$ is even or odd), and the lemma will be proved.

Consider any $K_i = b_1b_2 + b_2b_3 + \cdots + b_sb_1$, say. If a vertex $b_j$ of $K_i$ does
not lie on any point $a_k$, say $b_j \subset \widehat{a_ka_{k+1}}$, add to $K_i$ the boundary of the $\epsilon$-triangles
$b_{j-1}b_ja_k' + b_jb_{j+1}a_k'$, where $a_k'$ is a new vertex lying on $a_k$. The result
is an $(\epsilon, 1)$-circuit $K_i'\ \epsilon{\sim}\ K_i$, the vertex $b_j$ having been replaced by the vertex
$a_k'$. Repeat the process till we have an $(\epsilon, 1)$-circuit $K'' = c_1c_2 + c_2c_3 + \cdots + c_sc_1\ \epsilon{\sim}\ K_i$.

Now any two consecutive vertices $c_j$, $c_{j+1}$ lie on the same or consecutive
vertices of K'. Suppose $c_j$ is on $a_k$ and $c_{j+2}$ is on $a_{k+p}$, $p \neq 2$ or −2. Then add the
boundary of $c_jc_{j+1}c_{j+2}$, replacing the segments $c_jc_{j+1} + c_{j+1}c_{j+2}$ by the single
segment $c_jc_{j+2}$. Continue till we arrive at a (possibly void) $(\epsilon, 1)$-circuit
$K^* = d_1d_2 + \cdots + d_rd_1\ \epsilon{\sim}\ K_i$. If $d_j$ lies on $a_k$, then $d_{j+1}$ lies on $a_{k+1}$, where we
put $a_{n+1} = a_1$, etc.

If $K^*$ contains no segments, $K_i\ \epsilon{\sim}\ 0$. Otherwise, following the vertices
$d_1, d_2, \cdots, d_r, d_1$ of $K^*$, we have gone around J p times say. Add to $K^*$ the
boundaries of all the 2r $\epsilon$-triangles of the following sort. If $d_j$ lies on $a_k$, and
$d_{j+1}$ on $a_{k+1}$, two of the triangles are $d_jd_{j+1}a_k$ and $d_{j+1}a_ka_{k+1}$. The result is an
$(\epsilon, 1)$-cycle $pK' = 0$ or $K'$. Thus $K_i\ \epsilon{\sim}\ 0$ or $K'$, and the proof is complete.

An immediate consequence of this lemma is 


Lemma J. Let the simple closed curve J lie in the closed set R. If for every
$\epsilon > 0$ there is a 1-cycle K' in J as above described which is $\epsilon{\sim}0$ in R, then $J \sim 0$
in R.


Lemma K. If $\gamma$ is an arc, then for every $\epsilon > 0$ there is a $\delta > 0$ such that any
$(\delta, 1)$-cycle on $\gamma$ is $\epsilon{\sim}0$ on $\gamma$.

The proof below holds in fact if $\gamma$ is a closed k-cell, any k. It is sufficient
to prove it for the case that $\gamma$ is a closed line segment, in which case we can
take $\delta = \epsilon/2$.†

Let K be a $(\delta, 1)$-cycle on $\gamma$, let $a_0b_0$ be a segment of K, and say $\delta(\gamma) = \alpha$.
Choose a fixed point p in $\gamma$, and an integer $n > \alpha/\delta$. Let the vertices $a_1$,
$a_2, \cdots, a_{n-1}$ divide the segment $a_0p$ into n equal parts, and similarly for the
vertices $b_1, b_2, \cdots, b_{n-1}$. Add to K the boundaries of all triangles of the form
$a_ib_ia_{i+1}$, $b_ia_{i+1}b_{i+1}$, $a_{n-1}b_{n-1}p$, and of all similar triangles corresponding to the
other segments of K. The result is 0. As all the triangles employed are $\epsilon$-triangles,
$K\ \epsilon{\sim}\ 0$ in $\gamma$.
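
The cancellation claimed in this proof can be verified combinatorially. The sketch below is illustrative and rests on assumptions of our own: only vertex names are tracked (positions, and hence the $\epsilon$-condition, are not checked), the intermediate vertices on the segment from a vertex u to p are named `(u, i)`, the triangles used are those of one natural subdivision of the cone from $a_0b_0$ to p, and the function names `boundary` and `cone_triangles` are hypothetical.

```python
def boundary(chain):
    """Mod 2 boundary of a chain given as a set of frozensets of vertex names."""
    result = set()
    for s in chain:
        if len(s) > 1:
            for v in s:
                result ^= {s - {v}}
    return result

def cone_triangles(a0, b0, p, n):
    """Triangles joining the segment a0 b0 to the apex p, using n-1 intermediate
    vertices on each of the segments a0 p and b0 p."""
    a = [a0] + [(a0, i) for i in range(1, n)]   # a_0, a_1, ..., a_{n-1}
    b = [b0] + [(b0, i) for i in range(1, n)]   # b_0, b_1, ..., b_{n-1}
    tris = set()
    for i in range(n - 1):
        tris ^= {frozenset({a[i], b[i], a[i + 1]})}
        tris ^= {frozenset({b[i], a[i + 1], b[i + 1]})}
    tris ^= {frozenset({a[n - 1], b[n - 1], p})}
    return tris

# A 1-cycle uv + vw + wu; L is the sum of the cones over its segments.
K = {frozenset({"u", "v"}), frozenset({"v", "w"}), frozenset({"w", "u"})}
L = set()
for u, v in (("u", "v"), ("v", "w"), ("w", "u")):
    L ^= cone_triangles(u, v, "p", 3)
print(boundary(L) == K)   # adding the boundary of L to K gives 0 (mod 2)
```

The paths from each vertex to p occur once for every segment of K at that vertex, hence an even number of times, and so cancel.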

5. Outline of the proof of Theorem I. The proof runs as follows. 


† The essential point in the proof below is that $\gamma$ is convex: any two points of $\gamma$ are end points
of a line segment in $\gamma$. The proof is then easily extended to the case of any set homeomorphic with $\gamma$.


(a) In §6 we show how an arc $\gamma$ can be drawn in R crossing J,† avoiding
two given closed sets. $R - \gamma$ is not connected.

(b) In §7 we prove some lemmas. These show (§8) that $R - \gamma$ contains
exactly two components A' and B'. If $A = A' + \gamma$, then A and its boundary
curve $J_A$ (which is $\gamma$ plus a part of J) satisfy condition (1) of the theorem;
similarly for $B = B' + \gamma$ and $J_B$. Further, A and B are continuous curves.

(c) In §9 it is shown that any arc in A (or B) crossing $J_A$ ($J_B$) divides
A (B). Thus A and $J_A$ (B and $J_B$) satisfy all the conditions of the theorem.
Hence we can cut up each set just as we cut up R, and can continue indefi- 
nitely. 

(d) The object of §10 is to prove that R may be cut into pieces of arbi- 
trarily small diameter. 

(e) The homeomorphism between R and R’ is now easily established. We 
cut R up indefinitely, and cut R’ in a corresponding fashion. Any point p of R 
lies in a descending sequence of pieces; the corresponding sequence in R’ 
determines a point p’, which we let correspond to p. 

We turn now to the detailed proof. 

6. An arc crossing J. We prove here 


Lemma L.‡ Let the simple closed curve J be $\sim 0$ in the continuous curve R.
Let c and d be two points of J, dividing J into the two arcs $\eta_1$ and $\eta_2$. If C and D
are two closed sets in R containing c and d respectively, and $C \cdot D = 0$, then there
is an arc $\gamma$ in R joining $\eta_1$ to $\eta_2$ which has no points in C or in D.

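Say $\rho(C, D) = 3\epsilon$, and put $C' = W_\epsilon(C)$, $D' = W_\epsilon(D)$; then $\rho(C', D') \geq \epsilon$.
Take $\sigma$ so small that any two points in R within $\sigma$ of each other are joined by
an arc of diameter $< \epsilon$ (Lemma D). Take $\delta$ so small that any $(\delta, 1)$-cycle on
J is $\sigma{\sim}0$ in R. Construct the $(\delta, 1)$-cycle $K = cc_1 + c_1c_2 + \cdots + c_md + dd_1
+ d_1d_2 + \cdots + d_nc$, $c_i \subset \eta_1$, $d_i \subset \eta_2$. There is a $(\sigma, 2)$-chain

$L = L_C + L_D \to K$

in R, where we let $L_C$ contain all those triangles of L with vertices in C', and
let $L_D$ be the rest of L.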
Say 


Le > Ke = Ké + K*, 
where we let K¢ contain all those segments of Ke which are also in K. As 


Lec V,(C’), K*-D’=0. Define Ky’ by the relation 


† That is, γ lies in R, and has only its end points on J.
‡ Compare P. Urysohn, Über Räume mit verschwindender erster Brouwerscher Zahl, Proceedings,
Amsterdam Akademie van Wetenschappen, vol. 31 (1928), pp. 808-810.


CLOSED 2-CELLS 
Lp — Kp = Kp + K*. 
Adding these relations gives Z on the left, and hence K on the right: 
K=Kd+Ko. 
As all the segments of K¢’ are in K, Ky must contain just those segments of 
K not in K¢ ; in particular, it contains no segments of K*. Hence all the seg- 


ments of K* are present in Ky’ + K*, the boundary of Lp (i.e. none have can- 
celed out with segments of Ky’). Hence, as Lp-C’ =0, 


K*-C’ = K*-D! = 0. 
As Kc is the boundary of Le, it is a 1-cycle; hence 
Ket+co = Keo + K* 


By Lemma G, Ke+cc; contains a chain of segments joining c; to c. Following 
this chain, let », be the first vertex in 2, and fo, the last vertex before p, 
in m, and say pop:, pipo, - - - , Ps-1Ps are the segments in between. We shall 
show that these segments are in K*. If s>1 this is obvious, as then fi, - - - 
ps1 exist and are not on J. Suppose s=1 and pop: is not in K*; then it is in 
Ké +cc;. It could only be the segment cc;. But cc, lies in K and not in Kp, 
hence it is in Kg; it is not in K*, hence it is in Ke+K*=K¢’, and therefore 
not in K¢ +cc;. This proves the statement. 

Now let p:Pi4: be an arc of diameter <ein R,i=0, - - - ,s—1. These arcs 
form a continuous curve, from which we can pick out an arc y (Lemma B) 
joining 7, to 72; we can take ¥ so only its end points are on J. As pipiz1 ¢ K* 
and 6(pipis1) <€, y has no points in C or in D, and the lemma is proved. 

7. We prove three lemmas. 


Lemma M. If J ⊂ C, J~0 in C+D, and C·D=an arc γ, then J~0 in C.


Given an ε>0, choose first ε₁ so small that any (3ε₁, 1)-cycle on γ is
ε~0 in γ (Lemma K). Take next ε₂<ε₁, so that if p ⊂ D and ρ(p, C)<ε₂, then
ρ(p, γ)<ε₁. (If D₁=D−D·V_{ε₁}(γ), take ε₂<ρ(D₁, C).) Take finally δ<ε₂ so
that any (δ, 1)-cycle K on J is ε₂~0 in C+D; we shall show that K ε~0 in C.

Let L→K be an (ε₂, 2)-chain in C+D. Take any vertex p of L in D·V_{ε₂}(C)−γ,
and replace it by a vertex p’ ⊂ γ, where ρ(p, p’)<ε₁. L is thus replaced
by a (3ε₁, 2)-chain L’, in which each triangle lies wholly in either C or D.
Moreover, L’→K, as no vertices of K have been moved.

Put L’=L_C+L_D, where L_C contains those triangles of L’ in C. Say

L_C → K + K*; then L_D → K*.

K* is a (3ε₁, 1)-cycle lying in C·D=γ; it bounds an (ε, 2)-chain L* in γ.
Hence

L_C + L* → (K + K*) + K* = K.

L_C+L* is an (ε, 2)-chain in C, and the lemma is proved.


Lemma N. Let A·B=γ, an arc whose end points are a and b. Let the arcs
α and β join a and b in A and B respectively, neither having any points other
than a and b in common with γ. If α+β~0 in A+B, then α+γ~0 in A.


Given an ε>0, choose ε₁, ε₂ and δ as in the last lemma. Take (δ, 1)-chains
K_α, K_β and K_γ in α, β and γ respectively, each bounded by a+b; by Lemma J,
it is sufficient to show that K_α+K_γ ε~0 in A.

K_α+K_β bounds an (ε₂, 2)-chain L in A+B; we move each vertex of L
in B·V_{ε₂}(A)−γ onto γ, giving a (3ε₁, 2)-chain L’→K_α+K_β. Say L’=L_A+L_B,
where L_A ⊂ A, L_B ⊂ B. If L_A→K_α+K*, then L_B→K_β+K*, and
K* ⊂ γ. K*+K_γ is a (3ε₁, 1)-cycle on γ bounding an (ε, 2)-chain L* in γ.
Hence L_A+L*→K_α+K_γ in A, completing the proof.


Lemma O. Let α, β and γ be three arcs such that α·β=α·γ=β·γ=a+b.
Say α+γ ⊂ A and β+γ ⊂ B. If α+γ~0 in A and β+γ~0 in B, then α+β~0
in A+B.


Define K_α, K_β, K_γ as before; we need merely show that K_α+K_β ε~0 in
A+B. There are (ε, 2)-chains L_A and L_B such that L_A→K_α+K_γ in A and
L_B→K_β+K_γ in B; hence L_A+L_B→K_α+K_β in A+B.

8. The set R−γ. Let γ be any arc in R crossing J; say the end points of
γ divide J into the two arcs α and β. By condition (2) of the theorem, R−γ
is not connected. Let A’ and B’ be those components of R−γ containing
<α>† and <β> respectively. These are not the same component. For if
they were, putting A=A’+γ, D=R−A’, we have J ⊂ A, J~0 in R=A+D,
and A·D=γ; hence, by Lemma M, J~0 in A, a proper subset of R, contrary
to condition (1) of the theorem.

The same reasoning shows that R has no cut point p; we need merely replace
γ by p in Lemma M and above.

Put

A = A’+γ, B = B’+γ.

If D=R−A’, then A·D=γ and J=α+β~0 in R=A+D. Hence, by Lemma
N, α+γ~0 in A. Similarly, β+γ~0 in B. Consequently, by Lemma O,
J~0 in A+B, from which follows that A+B=R.

Moreover, α+γ is irreducibly ~0 in A. For if α+γ~0 in A*, α+γ ⊂ A*
⊂ A, then, by Lemma O, α+β~0 in A*+B; hence A*+B=R, which is only
possible if A*=A. Similarly, β+γ is irreducibly ~0 in B.


† <α> is α except for its end points, etc.




Let us show that A is a continuous curve. It is connected, as A’ is; it is 
self-compact, being a closed subset of a compact space. A is locally connected. 
For if p and q are points of A close enough together, there is an arc pq in R
of small diameter; if pq lies partly in B’, we can replace that part of it by an
arc of γ of small diameter. Lemma A now applies. Similarly, B is a continuous
curve.

9. We shall now show that any arc δ crossing J_A=α+γ in A divides A.
The following two lemmas will be useful.


Lemma P. If η₁ and η₂ are arcs contained within the arcs γ and β respectively,
then there is an arc pq crossing J_B in B, with p ⊂ η₁, q ⊂ η₂.


This is an immediate consequence of Lemma L, if we take, for the closed
sets of that lemma, the closed intervals of J_B complementary to η₁ and η₂.


Lemma Q. There are no two arcs ab and cd in R without common points, each
crossing J, whose end points are in the order acbd on J.


This follows directly from what we have seen above.

To show that 6 divides A, we must consider four cases. 

Case 1. Both end points of δ lie on α. Suppose A−δ is connected; then
it is arcwise connected, by Lemma E. Hence there is an arc in A−δ joining a
point p of α lying between the two end points of δ and a point q within γ. If
η₁ is an arc within γ containing q, there is an arc rs in B joining η₁ to a point
s within β, with only its end points r and s on J_B, by Lemma P. The arc
pqrs crosses J and does not touch δ. But the end points of this arc alternate
with those of δ on J, contradicting Lemma Q.

Case 2. δ is an arc cd, where c lies within α, d lies within γ. If A−δ is connected,
let pq be an arc in this set joining points of α on opposite sides of c.
If η₁ is an arc of γ containing d but not touching pq, let the arc rs join η₁ to
β in B; then the arcs pq and cdrs contradict Lemma Q.

Case 3. The end points c and d of δ lie within γ=ab, say in the order
acdb. If A−δ is connected, let pq be an arc in this set joining a point p within
α to a point q in γ between c and d. If η₁ is an arc of γ containing q but not
touching δ, let r₁s₁ be an arc in B joining η₁ to a point s₁ within β.

The arcs acr₁ of γ and r₁s₁ form an arc acr₁s₁ crossing J; hence

R − acr₁s₁ = C₁ + C₂,

where C₁ contains the open arc <as₁> of β, and C₂ contains b and points connected
with b. As r₁s₁ lies in B, A’ ⊂ C₁+C₂; the connected set A’+δ lies thus
in C₂. If η₂ is an arc of γ containing c but not touching η₁, and r₂s₂ is an arc in
C₁ joining η₂ to a point s₂ of β between a and s₁, then η₂+r₂s₂ does not touch
pqr₁s₁, and has only the point c in common with δ.

Similarly, if η₃ is an arc of γ containing d but not touching η₁, there is an
arc r₃s₃ in R−bdr₁s₁ such that r₃ lies in η₃, s₃ lies in β between s₁ and b, and
η₃+r₃s₃ does not touch pqr₁s₁ and has only the point d in common with δ.
The arc r₃s₃ does not touch r₂s₂, as it lies in C₂. Thus the two arcs pqr₁s₁ and
s₂r₂cdr₃s₃ (cd=δ) contradict Lemma Q.

Case 4. The same as Case 3, except that c=a or d=b, say the latter.
Then, in the notation of Case 3, the arcs pqr₁s₁ and s₂r₂cb (cb=δ) contradict
Lemma Q.

This completes the proof that A and J_A (B and J_B) satisfy the conditions
of Theorem I.

10. The cutting up of R. We are concerned with the following lemma. 


Lemma R. R may be cut into a finite number of pieces of arbitrarily small
diameter.


Given an ε>0, choose δ<ε so as to satisfy the requirement in Lemma F.
Suppose R is cut up so that the diameter of the boundary of each piece is
<δ. Then each piece is of diameter <3ε. For otherwise there is a point q of
some piece R_i at a distance ≥ε from its boundary J_i. Let p be a point of
J_i, and q’, a point of R−R_i at a distance ≥ε from p. Every arc from q to q’
must cut the boundary J_i of R_i and thus must pass within δ of p, contradicting
Lemma F.

The lemma thus follows from 


Lemma S. Given a δ>0, R can be cut up so that the diameter of the boundary
of each piece is <δ.

Express R as the union of a finite number of continua:

R = K₁ + K₂ + ··· , δ(K_i) < δ/2.

We shall cut up R in such a manner that no two of these continua K_i and K_j
have points on the boundary of the same piece of R, if K_i·K_j=0; the lemma
will then follow.

Suppose we have cut R up a certain amount (perhaps not yet at all), into
the pieces R₁, R₂, ··· , R_n with boundaries J₁, J₂, ··· , J_n (we may have R
and J alone). Of course each boundary J_i separates R_i from the rest of R. Take
any two continua, say K₁ and K₂, with K₁·K₂=0, each of which has points
on one of these J_i, say J₁. We shall cut R up further so that in the new pieces
there is no one (i.e. no piece, not merely no boundary of a piece) which has any
points in common with both K₁ and K₂; then on any further cutting up of R,
this will still be true.

Divide the points of J; into three sets, as follows. We put a point x into 
the first set if it lies in K,, or if following J; in both directions we reach points 




of K, before reaching points of Kz; we put x into the second set if the same 
conditions hold with K, and K, interchanged; all other points we put into the 
third set. This set Lj consists of open intervals of J:, each being bounded by 
a point of K, on one end and a point of Kz on the other. The points of the 
first set together with the points K,-R, form a closed set Z;, and those of the 
second set together with K2-R: form a closed set Ly. Then p(Li, L:)>0 as 
L,-I,=0, from which follows that there are but a finite number of intervals 
in Lj. As Ki is connected, each component of L; has points on J;, and thus on 
one of the intervals Z; of J; complementary to the intervals of LZ; . Thus there 
are a finite number of components Ln, Lis, - - - , Lim, in Li. Similarly there 
are a finite number of components La, L22, - - - , Lom, in Le. 

We shall now cut R; into a number of pieces, in each of which either K; has 
no points or Ke has no points. Suppose Lz, ---, Lam, and Ls, Lam; 
are the intervals of ZL; and L; respectively, and say they lie in the order Zz, 
, Lse, Lad, - » Lam,, Lam, on Jy. If we go around Ji, the intervals of lie 
alternately in Z; and LZ». Starting at Z3:, which lies in Zu say, go around J; till 
we reach another interval Z;, in Ly, (we may have gotten back to Zz:). Put 
Lo, L3,x-1 and all of J; between these into a set M? (which may be L;: alone), 
and put Z;;, Ls:, and all of J; between these on the other side from Lz: into a 
set M/ (which may be Lz; alone). Z3/ and L;,,-; are the two intervals of Ji 
complementary to My and MZ. 

No set Li; or Le; has points in both M/ and M72. This follows for Zu by 
construction. If it were false for some other set, say Z;,, then Z;, would have 
points on two intervals Z;, and L;, separated by Z3, and L3, on J:. Now 
Lu-Li.=0, hence p(Lu, L;,)>0. As R; is a continuous curve, there are con- 
tinuous curves and Z;,* in R; containing and Z;, and such that 
Lu*-L1,.*=0 (see Lemma C). These sets are arcwise connected, and we can 
draw arcs contradicting Lemma Q. 

Let M, be My plus all components Z;; and 2; containing points of My, 
and define M; similarly. Then M, and M; are closed, M,-M;=0, and M,+-M, 
> By Lemma L we can draw an arc ¥; from Z;/ to which has 
no points in M, or in Mz. R,; is thus cut into two pieces, in each of which there 
is at least one component Ji; or Z2;; for one contains Z,, and the other con- 
tains that Z2; containing L3:. Thus in each piece there are less than m,-+m, 
components, the number in Ri. 

If one of the resulting pieces contains more than one component, we cut
it up, etc. Finally each new piece of R₁ has points of only one component, and
thus K₁ and K₂ are separated in R₁. We now separate K₁ and K₂ in each
other piece R_i of R also. This is possible, for if K_j (j=1, 2) has points in any
R_i, it also has points on J_i.




If now there are any other two of the continua K_i and K_j, K_i·K_j=0, each
of which has points on some new J_k, we cut R further till this is no longer
true, etc. This completes the proof.

11. The homeomorphism. Cut R into pieces of diameter < some σ. We
make corresponding cuts in R’ as follows. The first arc γ drawn in R cuts R
into the two pieces R₁ and R₂ with boundaries J₁ and J₂ say. Draw any arc
γ’ crossing J’ in R’, cutting R’ into the pieces R₁’ and R₂’ with boundaries
J₁’ and J₂’. We note that J₁’+J₂’ is homeomorphic with J₁+J₂, with J_k’
corresponding to J_k, k=1, 2. Say γ₁ is an arc in R₁ cutting R₁ into pieces R₁₁
and R₁₂ with boundaries J₁₁ and J₁₂. If a₁ and b₁ are the end points of γ₁, let
a₁* and b₁* be the corresponding points of J₁’ in the above homeomorphism.
Draw an arc γ₁’ crossing J₁’ in R₁’, with end points a₁’ and b₁’ close to a₁* and
b₁* respectively (Lemma P); R₁’ is divided thereby into the pieces R₁₁’ and
R₁₂’ with boundaries J₁₁’ and J₁₂’. Moreover, J₁₁’+J₁₂’+J₂’ is homeomorphic
with J₁₁+J₁₂+J₂, with boundaries with the same subscripts corresponding.

In general, suppose R_{i₁···i_m} is a piece that is present after R is cut a certain
amount, and say the arc γ_{i₁···i_m} divides this set into the pieces R_{i₁···i_m 1}
and R_{i₁···i_m 2}, with boundaries J_{i₁···i_m 1} and J_{i₁···i_m 2}. If a_{i₁···i_m} and
b_{i₁···i_m} are the end points of γ_{i₁···i_m}, let a*_{i₁···i_m} and b*_{i₁···i_m} be the corresponding
points on J’_{i₁···i_m} in the homeomorphism we have already. Draw an arc
γ’_{i₁···i_m} crossing J’_{i₁···i_m}, with end points a’_{i₁···i_m} and b’_{i₁···i_m} close to the
above points, dividing R’_{i₁···i_m} into the pieces R’_{i₁···i_m 1} and R’_{i₁···i_m 2}, with
boundaries J’_{i₁···i_m 1} and J’_{i₁···i_m 2}. The set of boundaries with primes is now
homeomorphic with the set of boundaries without primes, boundaries with
the same subscripts corresponding. We note that if R_{i₁···i_m} and R_{j₁···j_m} have
common points, then R’_{i₁···i_m} and R’_{j₁···j_m} have common points, and conversely.

Having cut R into pieces of diameter <σ and having cut R’ in a corresponding
fashion, we now cut each piece of R’ into pieces of diameter
<σ/2 and cut each piece of R in a corresponding fashion. Next we cut each
resulting piece of R into pieces of diameter <σ/4, etc. Now for any ε>0
there is an m such that

δ(R_{i₁···i_m}) < ε,  δ(R’_{i₁···i_m}) < ε

for any m-fold subscript.

We now establish the homeomorphism between R and R’. Let p be any
point of R. It lies in either R₁ or R₂ (perhaps in both), say in R_{i₁}. Then it lies
in either R_{i₁1} or R_{i₁2} (perhaps in both), say in R_{i₁i₂}, etc. Thus we have a sequence
of pieces

R ⊃ R_{i₁} ⊃ R_{i₁i₂} ⊃ ···.

The corresponding pieces in R’ have a single limit point:

R’ ⊃ R’_{i₁} ⊃ R’_{i₁i₂} ⊃ ··· → p’.

This point p’ we let correspond to p.

If there are different sequences of pieces in R containing p, we have different
sequences in R’ defining points p’. However, all these points p’ are the
same. For if R, R_{i₁}, R_{i₁i₂}, ··· and R, R_{j₁}, R_{j₁j₂}, ··· are two sequences
containing p, then each piece R_{i₁···i_m} has points in common with R_{j₁···j_m},
namely, the point p; hence, as we saw above, R’_{i₁···i_m} and R’_{j₁···j_m} have
common points. Thus the corresponding sequences in R’ close down on a
single point. Similarly, to each point p’ in R’ corresponds a single point p in R.

Finally, the correspondence is continuous. For take a point p in R and an
ε>0. Let p’ be the corresponding point in R’, and choose an m so that
δ(R’_{i₁···i_m})<ε for all m-fold subscripts. Consider all the R_{i₁···i_m} with m-fold
subscripts which contain p; these include all points of R in some V_δ(p).
Then if q ⊂ V_δ(p), the corresponding point q’ is in V_ε(p’), and the continuity
is established. This completes the proof of Theorem I.

12. Proof of Theorem II. Let J be a circle in the plane, and let S be J plus 
its interior. S is self-compact, connected and locally connected, and is thus a 
continuous curve. That J~0 in S follows from Lemma K.†

To show that J is irreducibly ~0 in S, suppose that J~0 in S’, a proper
closed subset of S; we can suppose that S’ is a continuous curve. Let p be a
point of S not in S’, and let V_δ(p) have no points in S’. Let ab be a segment
of a straight line passing through p with its ends on J. Let a₁b₁ and a₂b₂ be
parallel segments enclosing ab, and lying at a distance δ from ab. Then in
that portion of S’ between a₁b₁ and a₂b₂, the (short) arcs a₁a₂ and b₁b₂ are not
connected. But if C and D are those parts of S’ outside a₁b₁ and a₂b₂, by Lemma
L we can draw an arc joining a₁a₂ to b₁b₂ in S’−(C+D), a contradiction.

Finally, that an arc crossing J in S divides S is a special (and easily 
proved) case of the Jordan theorem. This completes the proof. 

13. The Jordan theorem. Let J be a simple closed curve in the plane. Let 
I be a circle containing J in its interior. Draw two non-intersecting line
segments from J to I. S=I plus its interior is thus cut into three closed 2-cells,
one of which, say R, has the boundary J. Then R−J is the inside of J. The
points of J are obviously accessible from either side.


† For S is a closed 2-cell.


PRINCETON UNIVERSITY, 
PRINCETON, N. J. 




NEW SETS OF INDEPENDENT POSTULATES FOR THE 
ALGEBRA OF LOGIC, WITH SPECIAL REFERENCE 
TO WHITEHEAD AND RUSSELL’S PRINCIPIA 
MATHEMATICA* 


BY 
EDWARD V. HUNTINGTON 


INTRODUCTION 


Three sets of independent postulates for the algebra of logic, or Boolean
algebra, were published by the present writer in 1904. The first set, based on
the treatment in Whitehead’s Universal Algebra, is expressed in terms of (K,
+, ×), where K is a class of undefined elements, a, b, c, ··· , and a+b and
a×b are the results of two undefined binary operations. The second set is
expressed in terms of (K, <), where a<b is an undefined binary relation between
the elements a and b. The third set is expressed in terms of (K, +), or,
if one prefers, in terms of (K, ×).


* Presented to the Society, December 28, 1931, and September 2 and October 29, 1932; received
by the editors June 27, 1932. A brief bibliography of postulates for Boolean algebra, which makes
no pretence of being complete, is as follows:

E. Schröder, Algebra der Logik. Leipzig, Teubner, 1890.

A. N. Whitehead, Universal Algebra. Cambridge University Press, 1898.

E. V. Huntington, Sets of independent postulates for the algebra of logic. These Transactions,
vol. 5 (1904), pp. 288-309.

E. Schröder, Abriss der Algebra der Logik. Leipzig, Teubner, 1909-1910.

A. Del Re, Sulla indipendenza dei postulati della logica. Rendiconto, Accademia delle Scienze,
Naples, (3), vol. 17 (1911), pp. 450-458.

H. M. Sheffer, A set of five independent postulates for Boolean algebra, with application to logical
constants. These Transactions, vol. 14 (1913), pp. 481-488.

B. A. Bernstein, A complete set of postulates for the logic of classes expressed in terms of the operation
“exception,” and a proof of the independence of a set of postulates due to Del Re. University of
California Publications on Mathematics, vol. 1 (1914), pp. 87-96.

L. L. Dines, Complete existential theory of Sheffer’s postulates for Boolean algebras. Bulletin of the
American Mathematical Society, vol. 21 (1915), pp. 183-188.

B. A. Bernstein, A set of four independent postulates for Boolean algebra. These Transactions,
vol. 17 (1916), pp. 50-51.

B. A. Bernstein, A simplification of the Whitehead-Huntington set of postulates for Boolean algebras.
Bulletin of the American Mathematical Society, vol. 22 (1916), pp. 458-459.

J. G. P. Nicod, A reduction in the number of the primitive propositions of logic. Proceedings of the
Cambridge Philosophical Society, vol. 19 (1917), pp. 32-41.

N. Wiener, Certain formal invariances in Boolean algebras. These Transactions, vol. 18 (1917),
pp. 65-72.

C. I. Lewis, A Survey of Symbolic Logic. University of California Press, 1918.

H. M. Sheffer, Review of C. I. Lewis’s “A Survey of Symbolic Logic.” American Mathematical
Monthly, vol. 27 (1920), pp. 309-311.

If the class K is finite, it is well known that the number of elements must
be some power of 2; and any class consisting of 2, 4, 8, 16, ··· elements can
be made into a Boolean algebra by properly defining + and ×.

Every Boolean algebra contains a “zero element,” z, such that a+z=a,
and a “universe element,” u, such that a×u=a; and each element a determines
an element a’, called the “negative” of a, such that a+a’=u and
a×a’=z.

In 1913, H. M. Sheffer published a set of postulates for the same algebra
expressed in terms of (K, |), where the “stroke,” |, represents another binary
operation, called “rejection,” such that a|b=(a+b)’.
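The relation between the stroke and the operations + and ’ may be illustrated by a small computation. The following is only a sketch under stated assumptions: the elements of K are coded as the integers 0, ..., 2^N − 1 with N=3, a+b is read as bitwise OR and a’ as the complement within the universe element; none of this coding is taken from Sheffer's paper. Within that model the sketch checks the familiar identities a’=a|a, a+b=(a|b)|(a|b) and ab=(a|a)|(b|b).

```python
from itertools import product

N = 3                        # assumed number of "atoms"; K then has 2**N elements
U = (1 << N) - 1             # universe element u (every bit set)
K = range(U + 1)             # elements coded as the integers 0 .. 2**N - 1

def add(a, b):               # a+b, modelled here as bitwise OR
    return a | b

def neg(a):                  # a', modelled here as complement within U
    return U & ~a

def stroke(a, b):            # rejection: a|b = (a+b)'
    return neg(add(a, b))

def neg_from_stroke(a):      # a' = a|a
    return stroke(a, a)

def add_from_stroke(a, b):   # a+b = (a|b)|(a|b)
    s = stroke(a, b)
    return stroke(s, s)

def mul_from_stroke(a, b):   # ab = (a|a)|(b|b)
    return stroke(stroke(a, a), stroke(b, b))

recovers_neg = all(neg_from_stroke(a) == neg(a) for a in K)
recovers_add = all(add_from_stroke(a, b) == add(a, b) for a, b in product(K, repeat=2))
recovers_mul = all(mul_from_stroke(a, b) == (a & b) for a, b in product(K, repeat=2))
print(recovers_neg, recovers_add, recovers_mul)
```

The check is exhaustive only for the chosen model; it illustrates, and does not prove, that the single operation suffices.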


A. N. Whitehead and B. Russell, Principia Mathematica, second edition. Cambridge University 
Press, vol. 1, 1925. 

H. M. Sheffer, Review of “Principia Mathematica.” Isis, Quarterly organ of the History of Science 
Society, vol. 8(I) (1926), pp. 226-231. 

Paul Bernays, Axiomatische Untersuchung des Aussagen-Kalküls der “Principia Mathematica.”
Mathematische Zeitschrift, vol. 25 (1926), pp. 305-320.

B. A. Bernstein, Sets of postulates for the logic of propositions. These Transactions, vol. 28 (1926),
pp. 472-478.

D. Hilbert and W. Ackermann, Grundzüge der theoretischen Logik. Berlin, 1928.

Alfred Tarski, Fundamentale Begriffe der Methodologie der deduktiven Wissenschaften. I. Monatshefte
für Mathematik und Physik, vol. 37 (1930), pp. 1-44.

Kurt Gödel, Die Vollständigkeit der Axiome des logischen Funktionenkalküls. Monatshefte, vol.
37 (1930), pp. 349-360.

Kurt Gödel, Über formal unentscheidbare Sätze der Principia Mathematica und verwandter
Systeme. I. Monatshefte, vol. 38 (1931), pp. 173-198.

J. Lukasiewicz and A. Tarski, Untersuchungen über den Aussagenkalkül. Comptes Rendus des
Séances de la Société des Sciences et des Lettres de Varsovie, vol. 23 (1930), Class III, pp. 1-21.

J. Lukasiewicz, Philosophische Bemerkungen zu mehrwertigen Systemen des Aussagenkalküls.
Ibid., pp. 51-77.

A. Heyting, Die formalen Regeln der intuitionistischen Logik. Sitzungsberichte der preussischen
Akademie der Wissenschaften (Berlin), Jahrgang 1930, Physikalisch-Mathematische Klasse, pp.
42-56; 57-71; 158-169.

B. A. Bernstein, Whitehead and Russell’s theory of deduction as a mathematical science. Bulletin
of the American Mathematical Society, vol. 37 (1931), pp. 480-488.

Jørgen Jørgensen, A Treatise of Formal Logic. Oxford University Press, 3 vols., 1931.

E. V. Huntington, A new set of independent postulates for the algebra of logic with special reference 
to Whitehead and Russell’s Principia Mathematica. (This brief abstract includes the “fourth set” in 
the present paper and one other set of a different character.) Proceedings of the National Academy 
of Sciences, vol. 18 (1932), pp. 179-180. 

P. Henle, The independence of the postulates of logic. Bulletin of the American Mathematical 
Society, vol. 38 (1932), pp. 409-414. 

B. A. Bernstein, On proposition *4.78 of Principia Mathematica, Bulletin of the American 
Mathematical Society, vol. 38 (1932), pp. 388-391. 

C. I. Lewis and C. H. Langford, Symbolic Logic. New York, The Century Company, 1932. 




276 E. V. HUNTINGTON [January 


In 1914, B. A. Bernstein gave a set in terms of (K, −), where the “−”
represents another binary operation called “exception,” such that a−b
=a×b’; and also a set in terms of K and a further binary
operation called “adjunction,” which forms a+b’ from a and b.

In the meantime, the primitive propositions of Section A of the Principia 
Mathematica (1910) were expressed in terms of a class called the class of 
“elementary propositions,” a binary operation called “disjunction,” and a 
unary operation called “negation” ; and Bernstein has recently shown (June, 
1931) how these primitive propositions can be expressed in abstract mathe- 
matical form in terms of (K, +, ’). Since the relation between the theory of 
the Principia and the theory of Boolean algebra has been the subject of some 
discussion, it becomes a matter of interest to construct a set of independent 
postulates for Boolean algebra explicitly in terms of (K, +, ’), for comparison 
with the Principia. 

The present paper contains several such sets, numbered in such a way as 
to avoid confusion with the first, second, and third sets of 1904. 

The fourth set, containing six postulates, appears to be the simplest and 
most “natural” of all the sets of postulates for Boolean algebra. It contains 
no “existence” postulate. 

The fifth set, suggested by Sheffer’s set of 1913, is shorter by one postulate, 
but appears decidedly more “artificial” than the fourth set. 

The sixth set is modeled after the Principia-Bernstein set, with the addi- 
tion of an extra postulate which proves to be necessary to make the list suffi- 
cient for Boolean algebra. This set also appears artificial and complicated in 
comparison with the fourth set. 

All three of these sets are expressed in terms of (K, +, ’); but since in all 
these sets (following the usual mathematical custom) tacit use is made of the 
equality sign, it is more accurate to say that all these sets are expressed in 
terms of (K, +, ’, =). 

In the present paper, the rules governing the use of the equality sign are
listed in explicit form as Postulates A, B, C, D. Such an explicit statement of
the postulates governing the sign = is essential to any satisfactory comparison
between Boolean algebra and the Principia.

For, an outstanding feature of the Principia is that no postulates for = are
presupposed. The primitive propositions of the Principia do not contain the
equality sign, and the development of the theory proceeds without the use of
Postulates A, B, C, D. Instead, a symbol ≡ is introduced by definition, and
Postulates A, B, C, D (with ≡ written in place of =) are supposed to be
deduced as theorems.

It appears, however, that the desired properties of the sign ≡, as described
in the informal part of the Principia, cannot be rigorously deduced
from the formal list of primitive propositions and the formal definition of ≡
in the Principia, without the use of some additional postulates.

In Appendix I of the present paper, the connection between a Boolean
system (K, +, ’, =) and the Principia system (K, +, ’, ≡) is explained;
and in Appendix II a revised list of primitive propositions for the Principia
is given.

The resulting expression of the Principia’s system in strictly postula- 
tional form is believed to be free from the objections which might be raised 
against any formulation (like Bernstein’s of June, 1931) which pre-supposes 
the use of the equality sign. 

The new set of postulates for the Principia is shown to be “consistent”
and “independent” by the same methods that apply to any other set of
mathematical postulates.


THE FIRST SET (1904) 


For convenience of reference, the postulates of the “first set” for Boolean
algebra, which are expressed in terms of K, +, ×, are here reproduced, in
abbreviated form, with the original numbering. (The original ∧, ∨, and ā are
here replaced by z, u, and a’; and the circles around the + and × are
omitted.)

Ia. If a and b are in the class K, then a+b is in the class K.
Ib. If a and b are in the class K, then ab is in the class K.
IIa. There is an element z such that a+z=a for every element a.
IIb. There is an element u such that au=a for every element a.
IIIa. a+b=b+a.
IIIb. ab=ba.
IVa. a+bc=(a+b)(a+c).
IVb. a(b+c)=ab+ac.
V. For each element a there is an element a’ such that a+a’=u and
aa’=z.
VI. There are at least two distinct elements in the class K.

From these postulates the following theorems are deduced in the paper
cited.

VIIa. The z in IIa is unique. VIIb. The u in IIb is unique.
VIIIa. a+a=a. VIIIb. aa=a.
IXa. a+u=u. IXb. az=z.
Xa. a+ab=a. Xb. a(a+b)=a.
XI. The element a’ in V is uniquely determined by a.
XIIa. a+b=(a’b’)’. XIIb. ab=(a’+b’)’.
XIIIa. (a+b)+c=a+(b+c). XIIIb. (ab)c=a(bc).
XIV. The relation a<b is defined by any one of the following equations:
a+b=b; ab=a; a’+b=u; ab’=z.

Concerning the relation < we have the following theorems, 2.1—2.9, which 
correspond to the postulates 1-9 of the “second set” in the paper of 1904. 

2.1. a<a. 

2.2. If a<b and b<a, then a=b. 

2.3. If a<b and b<c, then a<c. 

2.4. z<a (where z is the element in Ila and VIIa). 

2.5. a<u (where u is the element in IIb and VIIb).

2.6. a<a+b; and if a<y and b<y, then a+b<y.

2.7. ab<a; and if x<a and x<b, then x<ab.

2.8. If x<a and x<a’, then x=z; and if a<y and a’<y, then y=u.

2.9. If a<b’ is false, then there is at least one element x, distinct from z, such 
that x <a and x <b. 


EXAMPLES OF BOOLEAN ALGEBRAS* 


The most familiar example of a Boolean algebra is the following: 

K =the class of regions in a square (including the null region, and 
the whole square) ; 

a+b=the smallest region which includes both a and b;

a’ =the region complementary to a with respect to the square;
ab =the region common to a and b.

Here the relation a<b means “a is included in b.” 

Another interesting example is the following, given by H. M. Sheffer in 
his review of C. I. Lewis’s A Survey of Symbolic Logic (American Mathe- 
matical Monthly, vol. 27 (1920), p. 310): 

K =a class of eight numbers, 1, 2, 3, 5, 6, 10, 15, 30; 

a+b=the least common multiple of a and b;

a’ =30/a; 
ab =the highest common factor of a and b. 

Here the relation a<b means “a is a factor of b.” 
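Sheffer's example is small enough to be checked exhaustively. The sketch below tests Postulates Ia to VI of the first set on the eight factors of 30, taking z=1 and u=30; the helper names and the dictionary of checks are introduced here for illustration only.

```python
from itertools import product
from math import gcd

V = 30
K = [d for d in range(1, V + 1) if V % d == 0]   # 1, 2, 3, 5, 6, 10, 15, 30

def add(a, b):      # a+b = least common multiple of a and b
    return a * b // gcd(a, b)

def mul(a, b):      # ab = highest common factor of a and b
    return gcd(a, b)

def neg(a):         # a' = 30/a
    return V // a

Z, U = 1, V         # zero element z and universe element u in this example

pairs = list(product(K, repeat=2))
triples = list(product(K, repeat=3))

checks = {
    "Ia, Ib": all(add(a, b) in K and mul(a, b) in K for a, b in pairs),
    "IIa, IIb": all(add(a, Z) == a and mul(a, U) == a for a in K),
    "IIIa, IIIb": all(add(a, b) == add(b, a) and mul(a, b) == mul(b, a) for a, b in pairs),
    "IVa": all(add(a, mul(b, c)) == mul(add(a, b), add(a, c)) for a, b, c in triples),
    "IVb": all(mul(a, add(b, c)) == add(mul(a, b), mul(a, c)) for a, b, c in triples),
    "V": all(add(a, neg(a)) == U and mul(a, neg(a)) == Z for a in K),
    "VI": len(set(K)) >= 2,
}

# The relation a<b, defined by a+b=b, should coincide with "a is a factor of b".
order_is_divisibility = all((add(a, b) == b) == (b % a == 0) for a, b in pairs)

for name, ok in checks.items():
    print(name, ok)
print("a<b is divisibility:", order_is_divisibility)
```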

Or, in general, let K=the class of 2^m numbers which are the factors of any
Boolean integer, u (“Boolean integer” being the name given by Sheffer to any
integer which contains no square factor); with a+b, a’, and ab defined as
illustrated above for the case u=30.
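The restriction to integers without square factors can be seen at work in a short experiment. The following sketch (function names are hypothetical) tests Postulate V, with a’=u/a, on the factors of an arbitrary integer; it suggests that the postulate holds exactly when the integer contains no square factor, since for 12 the factor 2 and its proposed negative 6 have the common factor 2.

```python
from math import gcd, isqrt

def divisors(v):
    """All positive factors of v, in increasing order."""
    return [d for d in range(1, v + 1) if v % d == 0]

def is_boolean_integer(v):
    """True when v contains no square factor other than 1."""
    return all(v % (p * p) != 0 for p in range(2, isqrt(v) + 1))

def complement_law_holds(v):
    """Postulate V with a' = v/a: lcm(a, a') = v and hcf(a, a') = 1 for every factor a."""
    for a in divisors(v):
        b = v // a
        if gcd(a, b) != 1 or a * b // gcd(a, b) != v:
            return False
    return True

print(len(divisors(210)), complement_law_holds(210))   # 210 = 2*3*5*7
print(len(divisors(12)), complement_law_holds(12))     # 12 contains the square 4
```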

Another example for eight elements is the following: 


* The name Boolean algebra (or Boolean “algebras”) for the calculus originated by Boole, ex- 
tended by Schréder, and perfected by Whitehead seems to have been first suggested by Sheffer, in 
1913. 


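K =a class of eight numbers: 0; 2, 3, 4; 23, 24, 34; and 234 (=u);
a+b, a’ and ab being defined as in the accompanying tables.


  +  |   0  234    2   34    3   24    4   23
-----+----------------------------------------
   0 |   0  234    2   34    3   24    4   23
 234 | 234  234  234  234  234  234  234  234
   2 |   2  234    2  234   23   24   24   23
  34 |  34  234  234   34   34  234   34  234
   3 |   3  234   23   34    3  234   34   23
  24 |  24  234   24  234  234   24   24  234
   4 |   4  234   24   34   34   24    4  234
  23 |  23  234   23  234   23  234  234   23


  a  |   0  234    2   34    3   24    4   23
-----+----------------------------------------
  a’ | 234    0   34    2   24    3   23    4


  ×  |   0  234    2   34    3   24    4   23
-----+----------------------------------------
   0 |   0    0    0    0    0    0    0    0
 234 |   0  234    2   34    3   24    4   23
   2 |   0    2    2    0    0    2    0    2
  34 |   0   34    0   34    3    4    4    3
   3 |   0    3    0    3    3    0    0    3
  24 |   0   24    2    4    0   24    4    2
   4 |   0    4    0    4    0    4    4    0
  23 |   0   23    2    3    3    2    0   23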


It will be observed that the digits in a+b include the digits in a and also
the digits in b (0 not counting as a digit); and the digits in ab are the digits
common to a and b. Hence the commutative, associative and distributive laws
are seen at once to be true. Also, the numbers 0 and 234 are seen to serve as
the elements z and u. By the same process, we can readily construct an example
for 2^m elements, where m is any integer.
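The construction by digits can be sketched for any list of distinct digits. In the sketch below the helper names `make_algebra`, `label` and `element` are hypothetical; an element is held as a set of digits, the empty set standing for 0.

```python
from itertools import combinations

def make_algebra(digits):
    """Build the digit example on the given digits (hypothetical helper).

    Elements are frozensets of digits; the empty set plays the part of 0.
    """
    digits = sorted(digits)
    universe = frozenset(digits)
    K = [frozenset(c) for r in range(len(digits) + 1)
         for c in combinations(digits, r)]

    def add(a, b):      # the digits in a together with the digits in b
        return a | b

    def mul(a, b):      # the digits common to a and b
        return a & b

    def neg(a):         # the digits of the universe element not in a
        return universe - a

    return K, add, mul, neg

def label(a):
    """Write an element as the text does: 0, 2, 23, 234, ..."""
    return "".join(str(d) for d in sorted(a)) or "0"

def element(text):
    """Read an element from its written form, e.g. "23" or "0"."""
    return frozenset(int(ch) for ch in text if ch != "0")

K, add, mul, neg = make_algebra([2, 3, 4])
print(sorted(label(a) for a in K))
print(label(add(element("2"), element("34"))),
      label(mul(element("23"), element("34"))),
      label(neg(element("24"))))
```

Calling `make_algebra` with m digits gives 2^m elements, which is the generalization the text has in view.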

The tables for four elements are conveniently written as follows: 


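 + || 0  1  2  3 || ’          × || 0  1  2  3
---++------------++---        ---++------------
 0 || 0  1  2  3 || 1          0 || 0  0  0  0
 1 || 1  1  1  1 || 0          1 || 0  1  2  3
 2 || 2  1  2  1 || 3          2 || 0  2  2  0
 3 || 3  1  1  3 || 2          3 || 0  3  0  3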


(These tables are the same as the upper left hand quarters of the tables 
for eight elements, the digit 4 being dropped, and the universe element 234 
being represented by 1.) 
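
The digit construction lends itself to a short program. The sketch below is illustrative only: the function `digit_algebra` and the use of Python frozensets to stand for the digit strings are assumptions made here, not notation from the paper. Each element is the set of its digits, with "0" naming the empty set.

```python
from itertools import combinations

def digit_algebra(digits):
    """Hypothetical helper: elements are the subsets of `digits`."""
    universe = frozenset(digits)
    K = [frozenset(c) for r in range(len(digits) + 1)
         for c in combinations(digits, r)]

    def name(s):            # print a set as a digit string, "0" if empty
        return "".join(sorted(s)) or "0"

    def add(a, b):          # digits occurring in a or in b
        return a | b

    def mul(a, b):          # digits common to a and b
        return a & b

    def neg(a):             # digits of the universe not in a
        return universe - a

    return K, name, add, mul, neg

K, name, add, mul, neg = digit_algebra("234")
two, three_four = frozenset("2"), frozenset("34")
print(name(add(two, three_four)))      # 234
print(name(mul(two, three_four)))      # 0
print(name(neg(two)))                  # 34
print(len(digit_algebra("2345")[0]))   # 16 elements for four digits
```

With m digits the construction yields 2^m elements, in agreement with the remark above.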


POSTULATES GOVERNING THE USE OF THE EQUALITY SIGN 


The postulates of the fourth, fifth, and sixth sets are expressed in terms of 
the undefined concepts (K, +, ’), the first two postulates in each set being 
the following: 

POSTULATE 1. If a and b are in the class K, then a+b is in the class K;




POSTULATE 2. If a is in the class K, then a’ is in the class K;
and in each of these sets (following the usual mathematical procedure), the 
use of the equality sign, =, is taken for granted. 

If preferred, however, the equality sign itself may be regarded as an ad- 
ditional undefined concept, provided suitable postulates are laid down 
governing its use. 

An obvious set of postulates for = is as follows, where a, b, c, ... are
understood to be elements of the class K.

POSTULATE A. If a is in the class K, then a=a.

POSTULATE B. If a=b, then b=a.

POSTULATE C. If a=b and b=c, then a=c.

POSTULATE D. If x=y, then f(x, a, b, c, ...)=f(y, a, b, c, ...), where
f(x, a, b, c, ...) is any element of the class K built up from the elements x, a,
b, c, ... by successive applications of the operators + and ’ (see Postulates 1
and 2), and f(y, a, b, c, ...) is the element obtained from f(x, a, b, c, ...) by
writing y in place of x throughout.

If these postulates A, B, C, D are added, the fourth, fifth, and sixth sets 
of postulates may be said to express Boolean algebra in terms of the four 
undefined concepts (K, +, ’, =). 


THE FOURTH SET 


The following set of independent postulates for Boolean algebra is ex- 
pressed in terms of (K, +, ’). K is a class of elements a, b, c,---; a+b 
denotes the result of a binary operation called logical addition; and a’ denotes 
the result of a unary operation called logical negation. (A trivial preliminary 
postulate 4.0, demanding that the class K shall contain at least two distinct 
elements, is assumed without further mention; and in Postulates 4.3-4.6 it
is assumed that the indicated combinations are elements of K. Also, Postu- 
lates A, B, C, D are assumed without further mention.) 


POSTULATE 4.1. If a and b are in the class K, then a+b is in the class K.
POSTULATE 4.2. If a is in the class K, then a’ is in the class K.
POSTULATE 4.3. a+b=b+a.
POSTULATE 4.4. (a+b)+c=a+(b+c).
POSTULATE 4.5. a+a=a.
POSTULATE 4.6. (a’+b’)’+(a’+b)’=a.
By aid of the usual definition of ab (or a×b), namely:
4.7. Definition. ab=(a’+b’)’,

the last postulate can be thrown into the following more familiar form:
4.8. ab+ab’=a.




From 4.6, by 4.3, we have (a’+b)’+(a’+b’)’=a, whence by 4.2 (writing b’ in
place of b), (a’+b’)’+(a’+b’’)’=a. But by 4.7, (a’+b’)’=ab and (a’+b’’)’=ab’. Hence
ab+ab’=a. Conversely, from 4.7 and 4.8 we have (a’+b’)’+(a’+b’’)’=a,
whence by 4.10, below, (a’+b’)’+(a’+b)’=a.

The consistency of these postulates is established by the existence of any 
system (K, +, ’) which satisfies them all, as, for example, any one of the 
examples of Boolean algebra mentioned above. 
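
The consistency argument can be replayed mechanically on any small finite system. The following is a minimal sketch, not taken from the paper: the helper name `failing_postulates` and the two test systems are illustrative assumptions. It checks Postulates 4.1-4.6 exhaustively and reports the labels of those that fail.

```python
from itertools import product

def failing_postulates(K, add, neg):
    """Labels of Postulates 4.1-4.6 that fail in the finite system (K, +, ').

    Hypothetical helper: add(a, b) plays the role of a+b and neg(a) the
    role of a'; both are assumed to be defined for every element of K.
    """
    K = list(K)
    checks = {
        "4.1": lambda a, b, c: add(a, b) in K,
        "4.2": lambda a, b, c: neg(a) in K,
        "4.3": lambda a, b, c: add(a, b) == add(b, a),
        "4.4": lambda a, b, c: add(add(a, b), c) == add(a, add(b, c)),
        "4.5": lambda a, b, c: add(a, a) == a,
        "4.6": lambda a, b, c: add(neg(add(neg(a), neg(b))),
                                   neg(add(neg(a), b))) == a,
    }
    triples = list(product(K, repeat=3))
    return [label for label, ok in checks.items()
            if not all(ok(a, b, c) for a, b, c in triples)]

# Two-element Boolean algebra: a+b = max(a, b), a' = 1-a.
print(failing_postulates([0, 1], max, lambda a: 1 - a))      # []
# Three-element chain with a' = 2-a: only Postulate 4.6 is violated.
print(failing_postulates([0, 1, 2], max, lambda a: 2 - a))   # ['4.6']
```

The second system is an assumption chosen here for contrast; it is not one of the independence examples given in the paper.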

To establish the equivalence of this fourth set (which is expressed in terms
of K, +, ’) and the first set of 1904 (which is expressed in terms of K, +, ×),
we must show (1) that all the postulates of the fourth set are deducible from
the postulates of the first set, when a’ is properly defined in terms of + and
×; and (2) that all the postulates of the first set are deducible from the
postulates of the fourth set, when a×b is properly defined in terms of
+ and ’.

The first part of the proof is immediately evident from the preceding 
section. 

The second part of the proof is provided by the following theorems which
are deduced from Postulates 4.1-4.6, with the aid of the definition of a×b
contained in 4.7.

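4.9. a+a’=a’+a’’.

By 4.6, [a]+[a’]=[(a’+a’’’)’+(a’+a’’)’]+[(a’’+a’’’)’+(a’’+a’’)’]
and [a’]+[a’’]=[(a’’+a’’)’+(a’’+a’)’]+[(a’’’+a’’)’+(a’’’+a’)’]. Hence
by 4.3 and 4.4, a+a’=a’+a’’.

Alternative proof, using 4.7 and 4.8 in place of 4.6: By 4.8, a+a’=
(aa’+aa’’)+(a’a’+a’a’’) and a’+a’’=(a’a+a’a’)+(a’’a+a’’a’). Hence by
4.3 and 4.4 (since by 4.3 and 4.7, ab=ba), we have a+a’=a’+a’’.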

4.10. a’’=a.

By 4.6, (a’’’+a’’)’+(a’’’+a’)’=a’’ and (a’+a’’’)’+(a’+a’’)’=a. But
by 4.9, a’+a’’=a’’+a’’’. Hence by 4.3, a’’=a.

Alternative proof, using 4.7 and 4.8 in place of 4.6: By 4.8, a’’a+a’’a’
=a’’ and aa’+aa’’=a. Hence by 4.7 and 4.3, (a’+a’’’)’+(a’’+a’’’)’=a’’
and (a’+a’’)’+(a’+a’’’)’=a. But by 4.9, a’+a’’=a’’+a’’’. Hence by 4.3,
a’’=a.

4.11. a+a’=b+b’.
Let x=a+a’ and y=b+b’. Then by 4.3, 4.6, 4.5, 4.4, 4.9,

y=b’+b=b’+[(b’+b’)’+(b’+b)’]=b’+[b’’+y’]=(b’+b’’)+y’=(b+b’)+y’=y+y’.

But by 4.6, with 4.3 and 4.4,

y+y’=[(y’+x’’)’+(y’+x’)’]+[(y’’+x’’)’+(y’’+x’)’]
    =[(x’’+y’’)’+(x’’+y’)’]+[(x’+y’’)’+(x’+y’)’].

Hence, by 4.6, y=x’+x.
Again, by 4.3, 4.6, 4.5, 4.4, 4.9, in the same way (with x and y
interchanged), x=y’+y.

Hence, by 4.3, x=y.

4.12. Definition. v=a+a’=the “universe” element of the system. 

This element v exists, by 4.1 and 4.2, and is unique, by 4.11. Moreover, 
by 4.3, v=a’+a. 

4.13. Definition. z= (a+a’)’ =the “zero” element of the system. 

This element z exists, by 4.1 and 4.2, and is unique, by 4.11. Obviously,
z=v’, where v is the universe element of 4.12; and by 4.10, z’=v.

4.14. If a’ =b’, then a=b. (By 4.2 and 4.10.) 

4.15. z+a=a.

By 4.6, (a’+a’)’+(a’+a)’=a. Hence by 4.5 and 4.3, (a’)’+(a+a’)’=a.
Hence by 4.10 and 4.12, a+v’=a, whence by 4.13 and 4.3, z+a=a.

4.16. av=a.

By 4.7, 4.3, 4.13, 4.15, 4.10, av=(a’+v’)’=(v’+a’)’=(z+a’)’=(a’)’=a.

4.17. aa’=z.

By 4.7, 4.12, 4.13, aa’=(a’+a’’)’=v’=z.

4.18. ab=ba. (By 4.7 and 4.3.) 

4.19. (ab)c=a(bc).

By 4.7 and 4.10, (ab)c=(a’+b’)’c=[(a’+b’)’’+c’]’=[(a’+b’)+c’]’ and
a(bc)=a(b’+c’)’=[a’+(b’+c’)’’]’=[a’+(b’+c’)]’. But these two values
are equal by 4.4.

4.20. a+b=(a’b’)’.

By 4.7 and 4.10, a’b’=(a’’+b’’)’=(a+b)’. Hence by 4.10, (a’b’)’
=(a+b)’’=a+b.

4.21. aa=a.

By 4.7, 4.5, 4.10, aa=(a’+a’)’=(a’)’=a.

4.22. a+v=v.

By 4.12, 4.4, 4.5, 4.12, a+v=a+(a+a’)=(a+a)+a’=a+a’=v.

4.23. az=z.

By 4.7, 4.13, 4.22, az=(a’+z’)’=(a’+v)’=v’=z.

4.24. a+ab=a.

By 4.8, ab+ab’=a. Hence by 4.3, 4.4, 4.5, a+ab=ab+a=ab+(ab+ab’)
=(ab+ab)+ab’=ab+ab’=a.

4.25. a(a+b)=a.

By 4.7, 4.20, 4.10, 4.24, 4.10, a(a+b)=[a’+(a+b)’]’=[a’+(a’b’)’’]’
=[a’+a’b’]’=[a’]’=a.




4.26. If a’+b=v and b’+a=v, then a=b.

By 4.15 and 4.3, a+v’=a. By 4.6, (a’+b’)’+(a’+b)’=a. Hence if
a’+b=v, (a’+b’)’=a. By 4.6, (b’+a’)’+(b’+a)’=b. Hence if b’+a=v,
(b’+a’)’=b. Hence by 4.3, a=b.

4.27. If a+b=v and ab=z, then a’=b.

From a+b=v, by 4.10, a’’+b=v. From ab=z, by 4.7, (a’+b’)’=z,
whence by 4.10, 4.13, 4.3, b’+a’=v. Hence by 4.26, a’=b.

In the following theorems, parentheses are omitted, in view of the asso- 
ciative laws, 4.4 and 4.19, and references to these laws, and to the commuta- 
tive laws, 4.3 and 4.18, will often be understood. 

4.28. abc+abc’+ab’c+ab’c’+a’bc+a’bc’+a’b’c+a’b’c’=v.

By 4.8, the given sum =ab+ab’+a’b+a’b’=a+a’, and by 4.12, a+a’
=v.
4.29. If A and B are any two distinct terms of the sum in 4.28, then AB=z.
For example, (ab’c)(a’bc)=(aa’)(b’cbc)=z(b’cbc)=z by 4.17 and 4.23.

4.30. ab+ac=abc+abc’+ab'c. 

By 4.8, ab=abc+abc’ and ac=abc+ab’c. Hence by 4.5, ab+ac=abc 
+abc’+ab’c. 

4.31. [a(b+c)]’=ab’c’+a’bc+a’bc’+a’b’c+a’b’c’.

By 4.7, 4.10, [a(b+c)]’=a’+(b+c)’=a’+b’c’. But by 4.8 a’=a’b+a’b’
=a’bc+a’bc’+a’b’c+a’b’c’, and by 4.18, 4.8, b’c’=ab’c’+a’b’c’. Hence the
theorem, by 4.5.

4.32. (ab+ac)+[a(b+c) ]’=v. (From 4.30, 4.31, by 4.28.) 

4.33. (ab+ac)[a(b+c)]’=z.

Let A, B, C, D, E, F, G, H be the eight terms in 4.28. By 4.30, ab+ac
=A+B+C, and by 4.31, [a(b+c)]’=D+E+F+G+H. By 4.29, 4.5,
AD+AE=z+z=z. Hence by 4.32, 4.15, [A(D+E)]’=v, whence by 4.10,
4.13, A(D+E)=z. Similarly, A(D+E)+AF=z+z=z, whence A(D+E+F)
=z. And so on; so that A(D+E+F+G+H)=z. By similar reasoning, we
find (A+B+C)(D+E+F+G+H)=z, which proves the theorem.

4.34. a(b+c)=ab+ac. (First form of the distributive law.)

From 4.32 and 4.33, by 4.27, (ab+ac)’=[a(b+c)]’. Hence by 4.14,
ab+ac=a(b+c).

4.35. a+bc=(a+b)(a+c). (Second form of the distributive law.)

By 4.10, 4.7, a+bc=(a’)’+(b’+c’)’=[a’(b’+c’)]’, whence by 4.34,
a+bc=[a’b’+a’c’]’. But also (a+b)(a+c)=[(a+b)’+(a+c)’]’=[a’b’
+a’c’]’. Hence a+bc=(a+b)(a+c).

These propositions include all the postulates of the first set of 1904 (see 
4.1, 4.7, 4.15, 4.16, 4.3, 4.18, 4.35, 4.34, 4.12, 4.17); so that any system (K, 
+, ’) which satisfies Postulates 4.1-4.6 will have all the properties of a
Boolean algebra, if the logical product, a×b, is defined in terms of + and
’ as in 4.7. 
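
As an illustration of how the product is recovered from + and ’ alone, the sketch below (an assumption-laden example, not part of the paper) takes Sheffer’s eight factors of 30, defines the product only through Definition 4.7, and checks that it coincides with the highest common factor and obeys both distributive laws 4.34 and 4.35.

```python
from math import gcd
from itertools import product

V = 30                                        # universe element of the example
K = [d for d in range(1, V + 1) if V % d == 0]

def add(a, b):          # a+b = least common multiple
    return a * b // gcd(a, b)

def neg(a):             # a' = 30/a
    return V // a

def mul(a, b):          # Definition 4.7: ab = (a'+b')'
    return neg(add(neg(a), neg(b)))

same_as_hcf = all(mul(a, b) == gcd(a, b) for a, b in product(K, repeat=2))
law_4_34 = all(mul(a, add(b, c)) == add(mul(a, b), mul(a, c))
               for a, b, c in product(K, repeat=3))
law_4_35 = all(add(a, mul(b, c)) == mul(add(a, b), add(a, c))
               for a, b, c in product(K, repeat=3))
print(same_as_hcf, law_4_34, law_4_35)        # True True True
```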

To prepare the way for the definition of the relation a<b, we prove the 
following theorems. 

4.36. If a+b=b, then ab=a; and conversely, if ab=a, then a+b=b.

If a+b=b, then by 4.7, 4.20, 4.24, 4.10,

ab=(a’+b’)’=[a’+(a+b)’]’=[a’+a’b’]’=[a’]’=a.

If ab=a, then by 4.20, 4.7, 4.10, 4.24,

a+b=(a’b’)’=[(ab)’b’]’=[(ab)’’+b’’]’’=ab+b=b.

4.37. If a+b=b, then a’+b=v; and conversely, if a’+b=v, then a+b=b.

If a+b=b, then by 4.7 and 4.22, a’+b=a’+(a+b)=(a’+a)+b=v+b
=v. If a’+b=v, then by 4.20, 4.15, 4.17, 4.34, 4.7, 4.10, 4.15, a+b=(a’b’)’
=[a’b’+z]’=[a’b’+bb’]’=[(a’+b)b’]’=(a’+b)’+b=z+b=b.

4.38. If a+b=b, then ab’=z; and conversely, if ab’=z, then a+b=b.

If a+b=b, then by 4.7, 4.12, 4.22,

ab’=(a’+b)’=[a’+(a+b)]’=[(a’+a)+b]’=(v+b)’=v’=z.

If ab’=z, then by 4.20, 4.10, a’+b=(ab’)’=z’=v, whence by 4.37, a+b=b.

4.39. Definition. If a+b=b; or if ab=a; or if a’+b=v; or if ab’=z; then
and only then we write a<b.

The equivalence of these four forms of the definition follows from 4.36, 
4.37, 4.38. 
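
The equivalence of the four forms can also be confirmed by enumeration in a concrete case. The sketch below is illustrative and rests on an assumption made here: the eight-element digit example is modelled by Python frozensets, with union for +, intersection for the product, and set difference from the universe for ’.

```python
from itertools import combinations, product

U = frozenset("234")                           # universe element v
Z = frozenset()                                # zero element z
K = [frozenset(c) for r in range(len(U) + 1)
     for c in combinations(sorted(U), r)]      # all eight elements

def neg(a):
    return U - a

def four_forms(a, b):
    """Truth values of the four conditions listed in Definition 4.39."""
    return (
        a | b == b,            # a+b = b
        a & b == a,            # ab = a
        neg(a) | b == U,       # a'+b = v
        a & neg(b) == Z,       # ab' = z
    )

forms_agree = all(len(set(four_forms(a, b))) == 1
                  for a, b in product(K, repeat=2))
matches_inclusion = all(four_forms(a, b)[0] == (a <= b)
                        for a, b in product(K, repeat=2))
print(forms_agree, matches_inclusion)          # True True
```

In this model the relation a<b is simply set inclusion, which agrees with the reading “a is included in b” given for the regions of a square.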

The following theorems are added because of their connection with the 
fifth and sixth sets, below. 

4.40. a+(b+c)’=[(b’+a)’+(c’+a)’]’.

By 4.7, 4.10, 4.34, 4.21, 4.24, 4.7, [(b’+a)’+(c’+a)’]’=(b’+a)(c’+a)
=b’c’+ab’+ac’+aa=[a+a(b’+c’)]+b’c’=[a]+b’c’=a+(b+c)’.

4.41. (b’+c)’+[(a+b)’+a+c]=v.

By 4.7, 4.20, 4.16, 4.12, 4.34, 4.35,

(b’+c)’=bc’=vbc’=(a+a’)bc’=abc’+a’bc’;
(a+b)’=a’b’=a’b’v=a’b’(c+c’)=a’b’c+a’b’c’;
a+c=a(b+b’)+(b+b’)c=ab+ab’+bc+b’c
   =(abc+abc’)+(ab’c+ab’c’)+(abc+a’bc)+(ab’c+a’b’c).

Hence the theorem, by 4.28 (with 4.5).
4.42. (a+a)’+a=v. (By 4.5, 4.12.) 
4.43. b’+(a+b) =v. 


By 4.3, 4.4, 4.12, 4.22, b’+(a+b)=a+(b+b’)=a+v=v.
4.44. (a+b)’+(b+a) =v. (By 4.3, 4.12.) 


INDEPENDENCE PROOFS FOR THE FOURTH SET 


The independence of the postulates of the fourth set is established by the 
existence of the following examples of systems (K, +, ’), each of which 
violates the like-numbered postulate, and satisfies all the other postulates of 
the set. 

Example 4.1. K = two elements, 0 and 1, with + and ’ defined as follows:
0+0=0, 1+1=1, 0+1=x, 1+0=x; 0’=1, 1’=0. Here x is not an element
of the class K, so that Postulate 4.1 fails. The other postulates are satisfied
whenever the indicated combinations are elements of the class.

Example 4.2. K = two elements, 0 and 1, with a+b and a’ defined as in
the accompanying table. Here x is not in the class K, so that Postulate 4.2
fails. The other postulates are satisfied whenever the indicated combinations
are elements of the class.

[Table of a+b and a’ for Example 4.2: entries not recoverable from this copy.]

Example 4.3. In this system, Postulate 4.3 fails, since 5+2=5 and 2+5 
=2. The other postulates will be found to be satisfied. 

[Table of a+b and a’ for Example 4.3: entries not recoverable from this copy.]

Example 4.4. Here 4.4 fails, since (2+1)+3=2+3=1 while 2+(1+3) 
=2+0=2. The other postulates are satisfied. 


[Table of a+b and a’ for Example 4.4: entries not recoverable from this copy.]


Example 4.5. Here 4.5 fails since 2+2=1. The other postulates are satis- 
fied. 


[Table of a+b and a’ for Example 4.5: entries not recoverable from this copy.]


Example 4.6. Here 4.6 fails since (3’+5’)’+(3’+5)’=(2+4)’+(2+5)’
=1’+1’=0+0=0≠3. The other postulates are found to be satisfied.


[Table of a+b and a’ for Example 4.6: entries not recoverable from this copy.]


THE FIFTH SET 


The following set of independent postulates for Boolean algebra is directly 
suggested by H. M. Sheffer’s postulates of 1913, when these are expressed 
in terms of (K, +, ’). 

This fifth set contains one fewer postulate than the fourth set; but the 
fifth set as a whole seems less simple and natural than the fourth set. 

(A trivial preliminary postulate 5.0 demanding that the class contain at 
least two elements is assumed without further mention, and in Postulates 
5.3-5.5 it is assumed that the indicated combinations are elements of K. Also, 
Postulates A, B, C, D are assumed without further mention.) 


POSTULATE 5.1. If a and b are in the class K, then a+b is in the class K.
POSTULATE 5.2. If a is in the class K, then a’ is in the class K.
POSTULATE 5.3. (a’)’=a.

POSTULATE 5.4. a+(b+b’)’=a.

POSTULATE 5.5. a+(b+c)’=[(b’+a)’+(c’+a)’]’.

The consistency of these five postulates is shown by any example of a 
Boolean algebra (K, +, ’), like the regions within a square, with + and ’ 
defined in the usual way. 
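
A consistency check of this kind is easy to carry out by machine. The sketch below is an illustration under assumptions made here: elements are modelled as 3-bit integers, + as bitwise OR, and ’ as complement within the universe 7. It verifies Postulates 5.3-5.5 and, for comparison, Theorems 5.6 and 5.7 proved below.

```python
from itertools import product

N = 3                      # number of atoms; the algebra has 2**N elements
V = (1 << N) - 1           # universe element: all bits set
K = range(V + 1)

def add(a, b):             # logical sum modelled as bitwise OR
    return a | b

def neg(a):                # complement relative to V
    return V ^ a

post_5_3 = all(neg(neg(a)) == a for a in K)
post_5_4 = all(add(a, neg(add(b, neg(b)))) == a
               for a, b in product(K, repeat=2))
post_5_5 = all(
    add(a, neg(add(b, c)))
    == neg(add(neg(add(neg(b), a)), neg(add(neg(c), a))))
    for a, b, c in product(K, repeat=3)
)
thm_5_6 = all(add(a, a) == a for a in K)                       # a+a = a
thm_5_7 = all(add(a, b) == add(b, a)
              for a, b in product(K, repeat=2))                # a+b = b+a
print(post_5_3, post_5_4, post_5_5, thm_5_6, thm_5_7)
```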

The equivalence between the fifth set and the earlier sets is established as
follows.

5.6. a+a=a.

By 5.5 and 5.3, b’+(b+b’)’=[(b’+b’)’+(b+b’)’]’, whence, applying
5.4 to each side, b’=[(b’+b’)’]’. Hence by 5.3, b’=b’+b’. Hence a’’=a’’
+a’’, whence by 5.3, a=a+a.

5.7. a+b=b+a.

By 5.3, 5.6, 5.5, 5.6, 5.3,

a+b=a+(b’)’=a+(b’+b’)’=[(b’’+a)’+(b’’+a)’]’
   =[(b’’+a)’]’=b+a.

5.8. (a’+b’)’+(a’+b)’=a.
By 5.4, 5.5, 5.3, 5.7,

a’=a’+(b+b’)’=[(b’+a’)’+(b’’+a’)’]’=[(b’+a’)’+(b+a’)’]’
  =[(a’+b’)’+(a’+b)’]’.

Hence by 5.3, a=(a’+b’)’+(a’+b)’.
The following theorems lead up to the associative law, 5.27.
5.9. a+a’=b+b’.
By 5.4, 5.7, 5.4,

(a+a’)’=(a+a’)’+(b+b’)’=(b+b’)’+(a+a’)’=(b+b’)’.

Hence by 5.3, a+a’=b+b’.

5.10. Definition. v=a+a’ =the “universe” element of the system. 

This element v exists, by 5.1 and 5.2, and is unique by 5.9. Moreover, by
5.7, v=a’+a.

5.11. Definition. z=(a+a’)’=the “zero” element of the system. 

This element z exists, by 5.1 and 5.2, and is unique by 5.9. Obviously, 
z=v’, where v=the universe element of 5.10; and by 5.3, z’ =v. 

5.12. a+z=a. 

By 5.11 and 5.4, a+z=a+(a+a’)’=a. 

5.13. a+v=v.

By 5.10, 5.12, 5.7, 5.5, 5.3, 5.11, 5.12, 5.3, v=a+a’=a+(a+z)’=
a+(v’+a)’=[(v’’+a)’+(a’+a)’]’=[(a+v)’+z]’=(a+v)’’
=a+v.

5.14. Definition. ab=(a’+b’)’.

By 5.1 and 5.2, if a and b are in the class K, then ab is in the class K.

5.15. aa=a. 

By 5.14, 5.6, 5.3, aa=(a’+a’)’ =(a’)’ =a. 

5.16. ab=ba. (By 5.14 and 5.7.) 

5.17. aa’ =z. 

By 5.14, 5.10, 5.11, aa’ =(a’+a)’=v' =z. 

5.18. az=z.

By 5.14, 5.11, 5.13, 5.11, az=(a’+z’)’=(a’+v)’=v’=z.

5.19. av=a.

By 5.14, 5.11, 5.12, 5.3, av=(a’+v’)’=(a’+z)’=(a’)’=a.

5.20. a+b=(a’b’)’. (By 5.14 and 5.3.) 

5.21. a+bc=(a+b)(a+c). 

By 5.14, 5.5 and 5.3, 5.14, 5.7, a+bc=a+(b’+c’)’=[(b+a)’+(c+a)’]’
=(b+a)(c+a)=(a+b)(a+c).

5.22. a(b+c) =ab+ac. 

By 5.14, 5.14 and 5.3, 5.21, 5.14 and 5.3, 5.14, we have a(b+c)= 
[a’+(b+c)’]’ =[a’+b’c’]’ =[(a’+b’)(a’ +c’)]’ =(a’ +b’)'+(a’+c’)’ =ab+ac. 

5.23. a+ab=a. 

By 5.19, 5.22, 5.13 and 5.7, 5.19, a+ab=av+ab=a(v+b)=av=a.

5.24. a(a+b)=a.

By 5.22, 5.15, 5.23, a(a+b)=aa+ab=a+ab=a.

5.25. If a’+b=v and b’+a=v, then a=b.

By 5.11 and 5.12, a+v’=a. By 5.8, (a’+b’)’+(a’+b)’=a. Hence if
a’+b=v, then by 5.12, (a’+b’)’=a. By 5.8, (b’+a’)’+(b’+a)’=b. Hence if
b’+a=v, then by 5.12, (b’+a’)’=b. Hence by 5.7, a=b.

5.26. If a+b=v and ab=z, then a’=b.

From a+b=v, by 5.3, a’’+b=v. From ab=z, by 5.14, (a’+b’)’=z,
whence by 5.3, 5.7, 5.11, b’+a’=v. Hence by 5.25, a’=b.

5.27. (a+b) +c=a+(b+c). 

We prove first the following lemmas. 


LEMMA (1). If x=(a+b)+c, then ax=a, bx=b, and cx=c.
For, by 5.7, 5.22, 5.24, 5.23,

ax=a[(a+b)+c]=a(a+b)+ac=a+ac=a,

whence similarly, bx=b; and by 5.7, 5.22, 5.15, 5.19, 5.22, 5.13, 5.19,

cx=c[(a+b)+c]=c(a+b)+cc=c(a+b)+c=c(a+b)+cv=c[(a+b)+v]=cv=c.

LEMMA (2). If x=(a+b)+c, then x’a=z, x’b=z, and x’c=z.

For, by 5.19 and 5.7, 5.10, 5.21, Lemma (1), 5.10, x+a’=v(a’+x)
=(a’+a)(a’+x)=a’+ax=a’+a=v, whence by 5.14, 5.11, x’a=(x+a’)’
=v’=z. Similarly, x’b=z and x’c=z.

LEMMA (3). If y=a+(b+c), then y+a’=v, y+b’=v, and y+c’=v.

For, by Lemma (2), with 5.7, y’a=z, y’b=z, y’c=z, whence, by 5.20,
5.11, y+a’=(y’a)’=z’=v. Similarly y+b’=v and y+c’=v.

The proof of the main theorem then proceeds as follows. By 5.22, 5.22,
5.6, x’y=x’[a+(b+c)]=x’a+x’(b+c)=x’a+(x’b+x’c)=z+(z+z)=z+z
=z; and by 5.7, 5.20 and 5.3, 5.21, 5.21, 5.19,

x’+y=y+x’=y+[c+(a+b)]’=y+c’(a+b)’
    =(y+c’)[y+(a+b)’]=(y+c’)[y+a’b’]=(y+c’)[(y+a’)(y+b’)]=v(vv)=v.

Hence by 5.26, x=y; that is, (a+b)+c=a+(b+c).

We can now establish the equivalence of the fifth set and the fourth set, 
as follows: 

Theorems 5.1, 5.2, 5.6, 5.7, 5.8, and 5.27 show that all the postulates of 
the fourth set are deducible from Postulates 5.1-5.5; and conversely all the 
postulates of the fifth set are readily deducible from Postulates 4.1-4.6. 

Incidentally, Theorems 5.1, 5.14, 5.12, 5.19, 5.7, 5.16, 5.21, 5.22, 5.10 and
5.17 show directly that all the postulates of the “first set” are deducible from
Postulates 5.1-5.5 (when the product ab is defined as in 5.14); and conversely,
all the postulates of the fifth set are readily deducible from Postulates Ia-VI
(when a’ is defined in terms of + and × as in V); so that the equivalence
between the fifth set and the “first set” is established directly, without reference
to the fourth set.

To show that the fifth set of postulates is equivalent to Sheffer’s set of 
1913, which occupies so important a position in the revised edition of the 
Principia Mathematica (volume 1, 1925), we need the definition of Sheffer’s 
“stroke” function, namely: 

5.28. Definition. a|b=(a+b)’=the “reject” of a and b (pronounced a
per b).

On the basis of this definition we deduce Sheffer’s postulates from Postu- 
lates 5.1-5.5 as follows: 

(1) There are at least two distinct K-elements. 

(2) Whenever a and b are K-elements, a|b is a K-element. (By 5.1, 5.2.)

Definition. a’=a|a. (By 5.6.)

(3) (a’)’=a. (By 5.3.)

(4) a|(b|b’)=a’. (By 5.4 and 5.3.)

(5) [a|(b|c)]’=(b’|a)|(c’|a). (By 5.5 and 5.3.)*
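
The passage from (K, +, ’) to the stroke function can be traced on a concrete algebra. The following sketch is illustrative only; it assumes a model chosen here (3-bit integers with bitwise OR for + and complement within 7 for ’) and the helper names `stroke` and `sneg`. It checks Sheffer’s postulates (3), (4), (5) and shows that + is recoverable from the stroke alone.

```python
from itertools import product

V = 0b111                  # universe element of the assumed 8-element model
K = range(V + 1)

def add(a, b):             # logical sum modelled as bitwise OR
    return a | b

def neg(a):                # complement relative to V
    return V ^ a

def stroke(a, b):          # Definition 5.28: a|b = (a+b)'
    return neg(add(a, b))

def sneg(a):               # Sheffer's definition: a' = a|a
    return stroke(a, a)

sheffer_3 = all(sneg(sneg(a)) == a for a in K)
sheffer_4 = all(stroke(a, stroke(b, sneg(b))) == sneg(a)
                for a, b in product(K, repeat=2))
sheffer_5 = all(
    sneg(stroke(a, stroke(b, c)))
    == stroke(stroke(sneg(b), a), stroke(sneg(c), a))
    for a, b, c in product(K, repeat=3)
)
# a+b expressed through the stroke only: a+b = (a|b)|(a|b).
add_from_stroke = all(sneg(stroke(a, b)) == add(a, b)
                      for a, b in product(K, repeat=2))
print(sheffer_3, sheffer_4, sheffer_5, add_from_stroke)
```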


* In a paper published in 1916, B. A. Bernstein showed that Sheffer’s postulates (3), (4), and 
(5) may be replaced by two postulates, P; and Py: 
Ps. a)| (b’| a) =a. 
Py. a’| (’|c)=[(| ]’. 
This change does not lead to any corresponding reduction in Postulates 5.1-5.5. 




INDEPENDENCE PROOFS FOR THE FIFTH SET 


The independence of the postulates of the fifth set is established by the 
existence of the following examples of systems (K, +, ’), each of which 
violates the like-numbered postulate, and satisfies all the other postulates of 
the set. 


Example 5.1. K = two elements, 0, 1, with a+b and a’ defined as in the
accompanying table, where x is any object not an element of the class K.

[Table of a+b and a’ for Example 5.1: entries not recoverable from this copy.]

Example 5.2. K = two elements, 0, 1, with a+b and a’ given by the table
(x being any object not an element of the class K).

[Table of a+b and a’ for Example 5.2: entries not recoverable from this copy.]


Example 5.3. K = six elements, with a+b and a’ given by the table.

[Table of a+b and a’ for Example 5.3: entries not recoverable from this copy.]


Here Postulate 5.3 fails, since (2’)’=3’=4. All the other postulates of 
the fifth set are found to be satisfied. 

It is interesting to note that while the commutative law, a+b=b+a,
does not hold in this example, it is always true that a+b=(b+a)’’.


Example 5.4. K = three elements, 0, 1, 2, with a+b and a’ given by the
table.

[Table of a+b and a’ for Example 5.4: entries not recoverable from this copy.]

Here Postulate 5.4 fails, since 0+(2+2’)’=2.


Example 5.5. K = three elements, 0, 1, 2, with a+b and a’ given by the
table.

[Table of a+b and a’ for Example 5.5: entries not recoverable from this copy.]


To show that Postulate 5.5 fails, take a=1, b=1, c=2.


THE SIXTH SET


The following set of postulates for Boolean algebra in terms of (K, +, ’) 
is suggested by B. A. Bernstein’s version of the primitive propositions of the 
Principia. The only modifications are as follows: (a) his proposition 1.5 is 
omitted because it can be proved as a theorem; (b) his notation “ =1” is here 
replaced by the notation “is in a subclass T”, which corresponds more nearly 
to the Frege assertion sign, ⊢; and (c) our Postulate 6.8 is an additional
postulate, not included among the primitive propositions of the Principia. 

A trivial preliminary postulate, 6.0, demanding that the class K contain at 
least two distinct elements, is assumed without further mention; and in 
Postulates 6.4-6.8 the indicated combinations are assumed to be elements of 
K. Also, Postulates A, B, C, D are assumed without further mention. The num- 
bers in brackets indicate the corresponding postulates in the Bernstein- 
Principia list. 

POSTULATE 6.1. [1.71.] If a and b are in the class K, then a+b is in the
class K.

POSTULATE 6.2. [1.7.] If a is in the class K, then a’ is in the class K.

There exists in the class K a subclass T having the following five properties:

POSTULATE 6.3. [1.1.] If a is in T and a’+b is in T, then b is in T.

POSTULATE 6.4. [1.2.] If a is in K, then (a+a)’+a is in T.

POSTULATE 6.5. [1.3.] If a and b are in K, then b’+(a+b) is in T.

POSTULATE 6.6. [1.4.] If a and b are in K, then (a+b)’+(b+a) is in T.

POSTULATE 6.7. [1.6.] If a, b, c are in K, then (b’+c)’+[(a+b)’+(a+c)]
is in T.

POSTULATE 6.8. If T is a subclass having the five properties just mentioned,
then we have: If a’+b is in T, and b’+a is in T, then a=b.


The consistency of these postulates is established by the existence of any 
Boolean algebra (K, +, ’), with the subclass T taken as the class containing 
the single element v. 
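
The consistency claim can be illustrated by direct enumeration. The sketch below is an example under assumptions made here (an eight-element model on 3-bit integers, with the subclass T taken as the set containing only the universe element 7); it is not part of the original text. Each entry of `results` records whether the like-numbered postulate holds in this model.

```python
from itertools import product

V = 0b111                  # universe element v of the assumed model
K = range(V + 1)
T = {V}                    # the subclass T consists of v alone

def add(a, b):             # logical sum modelled as bitwise OR
    return a | b

def neg(a):                # complement relative to V
    return V ^ a

pairs = list(product(K, repeat=2))
triples = list(product(K, repeat=3))

results = {
    "6.3": all(b in T for a, b in pairs
               if a in T and add(neg(a), b) in T),
    "6.4": all(add(neg(add(a, a)), a) in T for a in K),
    "6.5": all(add(neg(b), add(a, b)) in T for a, b in pairs),
    "6.6": all(add(neg(add(a, b)), add(b, a)) in T for a, b in pairs),
    "6.7": all(add(neg(add(neg(b), c)),
                   add(neg(add(a, b)), add(a, c))) in T
               for a, b, c in triples),
    "6.8": all(a == b for a, b in pairs
               if add(neg(a), b) in T and add(neg(b), a) in T),
}
print(results)
```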




The equivalence of the sixth set and the fourth set is established as follows. 

In the first place, all the postulates of the sixth set are readily deducible 
from the fourth set; the single element v (see 4.12) constitutes the required 
subclass T. 

We proceed to show, conversely, that all the postulates of the fourth set 
can be derived as theorems from the sixth set. 

6.9. If a’+b is in T and b’+c is in T, then a’+c is in T.

By 6.7, (b’+c)’+[(a’+b)’+(a’+c)] is in T. But by hypothesis, b’+c
is in T. Hence by 6.3, (a’+b)’+(a’+c) is in T. But by hypothesis, a’+b is
in T. Hence by 6.3, a’+c is in T.

Note. This theorem 6.9 corresponds to the “syllogism” in the theory of 
deduction, while 6.3 corresponds to the “rule of inference.” 
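
In a Boolean model the “syllogism” reduces to the transitivity of inclusion. The sketch below is a small illustration under assumptions chosen here (a four-element model on 2-bit integers, T containing only the universe element, and the helper name `implies` for the combination a’+b).

```python
from itertools import product

V = 0b11                   # universe element of the assumed 4-element model
K = range(V + 1)
T = {V}

def implies(a, b):         # the combination a'+b
    return (V ^ a) | b

# Theorem 6.9: if a'+b and b'+c are in T, then a'+c is in T.
syllogism = all(implies(a, c) in T
                for a, b, c in product(K, repeat=3)
                if implies(a, b) in T and implies(b, c) in T)

# In this model a'+b is in T exactly when ab = a, that is, when a<b.
is_inclusion = all((implies(a, b) in T) == (a & b == a)
                   for a, b in product(K, repeat=2))
print(syllogism, is_inclusion)
```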

By the aid of this theorem we can establish at once the redundancy of 
proposition 1.5 in the Bernstein-Principia list. This theorem 1.5 will serve 
as a lemma in the proof of the associative law (6.12). 

6.10. [1.5.] [a+(b+c)]’+[b+(a+c)] is in T.

(The following proof is adapted from a proof given, in another notation,
by P. Bernays in 1926. It does not involve Postulate 6.8.)

By 6.5, c’+(a+c) is in T.

By 6.7, [c’+(a+c)]’+{(b+c)’+[b+(a+c)]} is in T.

Hence by 6.3, (b+c)’+[b+(a+c)] is in T.

By 6.7, {(b+c)’+[b+(a+c)]}’+([a+(b+c)]’+{a+[b+(a+c)]}) is in T.

Hence by 6.3, [a+(b+c)]’+{a+[b+(a+c)]} is in T.
By 6.6, {a+[b+(a+c)]}’+{[b+(a+c)]+a} is in T.
Hence by 6.9,

(1) [a+(b+c)]’+{[b+(a+c)]+a} is in T.

By 6.5, a’+(c+a) is in T.

By 6.6, (c+a)’+(a+c) is in T.

Hence by 6.9, a’+(a+c) is in T.

By 6.5, (a+c)’+[b+(a+c)] is in T.

Hence by 6.9, a’+[b+(a+c)] is in T.

By 6.7, {a’+[b+(a+c)]}’+({[b+(a+c)]+a}’+{[b+(a+c)]+[b+(a+c)]})
is in T.

Hence by 6.3, {[b+(a+c)]+a}’+{[b+(a+c)]+[b+(a+c)]} is in T.

By 6.4, {[b+(a+c)]+[b+(a+c)]}’+[b+(a+c)] is in T.

Hence by 6.9,

(2) {[b+(a+c)]+a}’+[b+(a+c)] is in T.

From (1) and (2), by 6.9, [a+(b+c)]’+[b+(a+c)] is in T.

The next three theorems correspond to Postulates 4.3, 4.4, 4.5 of the 
fourth set. 

6.11. a+b=b+a.

By 6.6, (a+b)’+(b+a) is in T. Again, by 6.6, (b+a)’+(a+b) is in T.
Hence by 6.8, a+b=b+a.

6.12. (a+b)+c=a+(b+c).

By 6.10, [a+(c+b)]’+[c+(a+b)] is in T. Hence by 6.11,

(1) [a+(b+c)]’+[(a+b)+c] is in T.

From (1), [c+(b+a)]’+[(c+b)+a] is in T. Hence by 6.11,

(2) [(a+b)+c]’+[a+(b+c)] is in T.

From (1) and (2), by 6.8, a+(b+c)=(a+b)+c.

6.13. a+a=a.

By 6.4, (a+a)’+a is in T. By 6.5, a’+(a+a) is in T. Hence by 6.8, a+a
=a.

The following theorems establish the existence and properties of the universe
element v.

6.14. If a is in K, then a+a’ is in T.

By 6.7, [(a+a)’+a]’+{[a’+(a+a)]’+(a’+a)} is in T. But by 6.4,
(a+a)’+a is in T. Hence by 6.3, [a’+(a+a)]’+(a’+a) is in T. But by
6.5, a’+(a+a) is in T. Hence by 6.3, a’+a is in T. Hence by 6.11, a+a’ is
in T.

6.15. a+a’=b+b’.

By 6.7, [(a+b)’+(b+a)]’+{[b’+(a+b)]’+[b’+(b+a)]} is in T. But
by 6.6, (a+b)’+(b+a) is in T. Hence by 6.3, [b’+(a+b)]’+[b’+(b+a)]
is in T. But by 6.5, b’+(a+b) is in T. Hence by 6.3,

(1) b’+(b+a) is in T.

From (1), (a+a’)’+[(a+a’)+(b+b’)’] is in T. But by 6.14, a+a’ is in T.
Hence by 6.3,

(2) (a+a’)+(b+b’)’ is in T.

Now by 6.6, [(a+a’)+(b+b’)’]’+[(b+b’)’+(a+a’)] is in T. Hence by (2)
and 6.3,

(3) (b+b’)’+(a+a’) is in T.

From (3),

(4) (a+a’)’+(b+b’) is in T.

From (4) and (3), by 6.8, b+b’=a+a’.
This theorem justifies the following definition.
6.16. Definition. v=a+a’=the “universe” element of the system.
This element v exists, by 6.1 and 6.2, and is unique, by 6.15. Also, by
6.11, v=a’+a; and obviously, by 6.14, v is in T.
6.17. If a is in T, then a=v.
By 6.5,

(1) a’+(a’+a) is in T.

Again, by 6.5, a’+[(a’+a)’+a] is in T. But by hypothesis, a is in T. Hence
by 6.3,

(2) (a’+a)’+a is in T.

From (1) and (2), by 6.8, a=a’+a. Hence by 6.16, a=v.

This theorem shows that the subclass T consists of a single element of the
class K, namely, the element v defined in 6.16.


6.18. a+v=v (whence, by 6.11, v+a=v).

By 6.5, v’+(a+v) is in T. But by 6.16, v is in T. Hence by 6.3, a+v is
in T. Hence by 6.17, a+v=v.

6.19. v’+a=a (whence, by 6.11, a+v’=a).

By 6.7, (v’+a)’+[(a+v)’+(a+a)] is in T. Hence by 6.18 and 6.13,
(v’+a)’+(v’+a) is in T. Hence by 6.11 and 6.12, v’+[(v’+a)’+a] is in T.
Hence by 6.3, (v’+a)’+a is in T. But by 6.5, a’+(v’+a) is in T. Hence by
6.8, v’+a=a.

6.20. v’≠v.

Suppose v’=v. Then by 6.19 we would have a+v=a. But by 6.18,
a+v=v. Hence we would have a=v; that is, every element a would be
equal to the element v, so that the class K would contain only a single
element. This trivial case is excluded by Postulate 6.0.

6.21. a’=a is always false.

Suppose there existed an element c such that c’=c. Then by 6.13 and 6.16,
c=c+c=c+c’=v. Hence c’=v’. Hence, if c’=c we would have v’=v,
which by 6.20 is impossible.

6.22. If a is in T, then a’ is not in T.

Since a is in T, by 6.17, a=v, whence by 6.2, a’=v’. If a’ also were in T,
then, by 6.17, a’=v, whence v’=v, which by 6.20 is impossible.

6.23. a’’ =a. 

By 6.7, (a’’+a’’’)’+[(a+a’)’+(a+a’’’)] is in T. But by 6.14, a’’+a’’’
is in T. Hence by 6.3, (a+a’)’+(a+a’’’) is in T. But by 6.14, a+a’ is in T.
Hence by 6.3, a+a’’’ is in T, whence by 6.11,

(1) a’’’+a is in T.

Again, by 6.14,

(2) a’+a’’ is in T.

From (1) and (2), by 6.8, a’’=a.

6.24. Definition. ab=(a’+b’)’.

6.25. ab=ba. (By 6.24 and 6.11.)

6.26. (ab)c=a(bc). (By 6.24, 6.23, 6.12.)

6.27. aa=a. (By 6.24, 6.23, 6.13.)

6.28. av=a.

By 6.24, 6.11, 6.19, 6.23, av=(a’+v’)’=(v’+a’)’=(a’)’=a.

6.29. Definition. z=v’=the “zero” element of the system.

6.30. a+z=a. (By 6.29 and 6.19.)

6.31. az=z.

By 6.24, 6.29, 6.23, 6.18, az=(a’+z’)’=(a’+v)’=v’=z.

6.32. aa’ =z. 

By 6.24, 6.16, 6.29, aa’ =(a’+a’’)’ =v’ =z. 

6.33. (ab)’=a'+b’. (By 6.24, 6.23.) 

6.34. (a+b)’=a'b’. (By 6.24, 6.23.) 

In the following proofs, tacit use will be made of 6.17. 

6.35. If a’+b=v, then ab=a.

By 6.7, [b’+(a’+ab)]’+{(a’+b)’+[a’+(a’+ab)]}=v. But by 6.11,
6.12, 6.24, 6.16, b’+(a’+ab)=(a’+b’)+ab=(a’+b’)+(a’+b’)’=v. Hence
by 6.3, (a’+b)’+[a’+(a’+ab)]=v. But by hypothesis, a’+b=v. Hence by
6.3, a’+(a’+ab)=v, whence by 6.12 and 6.13,

(1) a’+ab=v.

Again, by 6.33, 6.11, 6.12, 6.16, (ab)’+a=(a’+b’)+a=b’+(a+a’)=b’+v,
whence by 6.18,

(2) (ab)’+a=v.

From (1) and (2), by 6.8, a=ab.
6.36. If ab’=z, then a+b=b.
From ab’=z, by 6.33, 6.25, 6.29, b’’+a’=(b’a)’=(ab’)’=z’=v. Hence
by 6.35, b’a’=b’, whence by 6.34, (b+a)’=b’. Hence by 6.11, 6.23, a+b=b.
6.37. If ab’=z, then ab=a.
From ab’=z, by 6.24, 6.23, 6.29, a’+b=v. Hence by 6.35, ab=a.
6.38. If ab=a, then a+b=b.

From ab=a, by 6.33, a’=a’+b’. Hence by 6.12, 6.13, 6.16, 6.18, a’+b
=(a’+b’)+b=a’+v=v, whence by 6.25, 6.23, b’’+a’=v.
Hence by 6.35, b’a’=b’, whence by 6.34, (b+a)’=b’. Hence by 6.25, 6.23,
a+b=b.

6.39. a(ab+ab’)’ =z. 

Let x=a(ab+ab’)’. Then by 6.26, 6.25, 6.32, 6.31,

(1) x(ab+ab’)=[a(ab+ab’)’](ab+ab’)=a[(ab+ab’)’(ab+ab’)]=az=z;
(2) xa’=[a(ab+ab’)’]a’=(ab+ab’)’(aa’)=(ab+ab’)’z=z.

From (1), by 6.25, 6.23, (ab+ab’)x’’=z, whence by 6.36, ab+ab’+x’=x’.
Hence by 6.11, 6.12, 6.13, ab+x’=ab+(ab+ab’+x’)=ab+ab’+x’=x’,
whence by 6.24, 6.23,

(3) (ab)’x=x;

also, ab’+x’=ab’+(ab+ab’+x’)=ab+ab’+x’=x’, whence by 6.24, 6.23,

(4) (ab’)’x=x.

From (4), by 6.26, 6.25, 6.32, 6.31,

(xa)b’=x(ab’)=[(ab’)’x](ab’)=x[(ab’)(ab’)’]=xz=z,

whence by 6.37, (xa)b=xa.
From (2), xa’=z, whence by 6.37, xa=x. Hence (xa)b=x, whence by 6.26,

(5) x(ab)=x.

From (3) and (5), by 6.27, 6.26, 6.25, 6.32, 6.31,

x=xx=[x(ab)][x(ab)’]=x[(ab)(ab)’]=xz=z.


6.40. ab+ab’ =a. 
By 6.25, 6.26, 6.27, (ab)a=a(ab)=(aa)b=ab, whence by 6.38,

(1) ab+a=a.

From (1),

(2) ab’+a=a.

Hence (ab+a)+(ab’+a)=a+a, whence by 6.25, 6.26, 6.27, a+(ab+ab’)=a.
But by 6.39, a(ab+ab’)’=z, whence by 6.36, a+(ab+ab’)=ab+ab’. Therefore
ab+ab’=a.

6.41. (a’+b’)’+(a’+b)’ =a. (From 6.40, by 6.24 and 6.23.) 

The proof of the equivalence of the sixth set and the fourth set is thus 
complete; Theorems 6.1, 6.2, 6.11, 6.12, 6.13, and 6.41 show that all the 
postulates of the fourth set are deducible from the sixth set. 




INDEPENDENCE PROOFS FOR THE SIXTH SET 
We first give three examples for the independence of Postulate 6.8. 
Example 6.8 (1). K = three elements, 0, 1, 2; with a+b and a’ given by
the table.

[Table of a+b and a’ for Example 6.8 (1): entries not recoverable from this copy.]


This system (K, +, ’) has all the properties called for by Postulates 
6.1-6.7, with the subclass T consisting of the single element 1. The system 
fails on 6.8, since 2’+0=1 and 0’+2=1, but not 2=0. It is interesting to
note that a’’’+a in T and a’+a’’ in T both hold, but a’’=a is not true when
a=2. We note also that a+b=b+a and (a+b)+c=a+(b+c) and a+a=a;
further, if a+b is in T then a is in T or b is in T.

Example 6.8 (2). K =five elements. (This example was suggested to me, 
in another connection, by Dr. K. E. Rosinger.) 


[Table of a+b and a’ for Example 6.8 (2): entries not recoverable from this copy.]


Postulates 6.1-6.7 hold, with the subclass T consisting of the single element 
1. Postulate 6.8 fails, since 2’+3=1 and 3’+2=1, but not 2=3. Here 
a+b=b+a and (a+b)+c=a+(b+c) and a+a=a are always true, but not 
a’’=a. Further, a+b can equal 1 when neither a=1 nor b=1.

Example 6.8 (3). K =six elements. (This example was suggested to me 
by Mr. P. Henle.) 


[Table of a+b and a’ for Example 6.8 (3), with K = 0, 1, 2, 3, 4, 5: entries not recoverable from the source.]


298 E. V. HUNTINGTON [January


Postulates 6.1-6.7 will be found to hold, with the subclass T consisting
of the single element 1. Postulate 6.8 fails, since 2’+5=1 and 5’+2=1, but
not 2=5. We note that a’’=a holds. Also, (a+b)’+(b+a)=1, but not
a+b=b+a. Also, (a+a)’+a=1 and a’+(a+a)=1, but not a+a=a.

Obvious examples for Postulates 6.1 and 6.2 are the following, in which x is
any object not in the class K.

[Tables for Examples 6.1 and 6.2: entries not recoverable from the source.]

The remaining examples (for 6.3-6.7) I take from a recent paper by
P. Henle (The independence of the postulates of logic, Bulletin of the American
Mathematical Society, 1932).

[Tables for Examples 6.3, 6.4, 6.5, 6.6, and 6.7: entries not recoverable from the source.]


The following unsolved problem may be noted. If 6.8 were replaced by 
the same postulate without the qualifying clause (call it 6.8a), then the 
independence of 6.3 would become an open question (since the present 
Example 6.3 does not satisfy 6.8a). 




APPENDIX I 
THE CONNECTION BETWEEN BOOLEAN ALGEBRA AND THE PRINCIPIA 


In order to establish the connection between Boolean algebra and the 
system set forth in Section A of the Principia, we first quote the following 
propositions verbatim from the second edition of the Principia. 

*1.71. If p and q are elementary propositions, p v q is an elementary proposition.

*1.7. If p is an elementary proposition, ~p is an elementary proposition.

*4.31. ⊢ : p v q . ≡ . q v p.

*4.33. ⊢ : (p v q) v r . ≡ . p v (q v r).

*4.25. ⊢ : p . ≡ . p v p.

*4.5. ⊢ : p . q . ≡ . ~(~p v ~q).

*4.42. ⊢ :. p . ≡ : p . q . v . p . ~q.

If now we call the class of “elementary propositions” the class K, and
write p+q for p v q, and p’ for ~p, these propositions become the following:

*1.71. If p and q are in the class K, then p+q is in the class K.

*1.7. If p is in the class K, then p’ is in the class K.

*4.31. p+q ≡ q+p.

*4.33. (p+q)+r ≡ p+(q+r).

*4.25. p ≡ p+p.

*4.5. pq ≡ (p’+q’)’.

*4.42. p ≡ pq+pq’.

But these propositions are precisely the same as the postulates of our
fourth set for Boolean algebra, except that the sign ≡ occurs in place of
the sign =.
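As a sanity check on the five quoted equivalences, one may read them under the usual two-valued interpretation and test every assignment of truth values. The sketch below assumes that reading (≡ as equality of truth values, v as `or`, ~ as `not`, the dot as `and`); the function names are hypothetical and the check says nothing about derivability within the Principia.

```python
from itertools import product

BOOL = (False, True)


def tautology(formula, arity):
    """Return True when `formula` holds for every assignment of truth values."""
    return all(formula(*vals) for vals in product(BOOL, repeat=arity))


def check_quoted_propositions():
    # Each entry evaluates one quoted equivalence over all truth assignments.
    return {
        "*4.31": tautology(lambda p, q: (p or q) == (q or p), 2),
        "*4.33": tautology(lambda p, q, r: ((p or q) or r) == (p or (q or r)), 3),
        "*4.25": tautology(lambda p: p == (p or p), 1),
        "*4.5": tautology(lambda p, q: (p and q) == (not ((not p) or (not q))), 2),
        "*4.42": tautology(lambda p, q: p == ((p and q) or (p and (not q))), 2),
    }
```

Calling `check_quoted_propositions()` returns a dictionary keyed by proposition number, so a failed reading of any one formula is easy to locate.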

It remains, therefore, to examine the properties of the sign ≡ as used in
the Principia, in comparison with Postulates A, B, C, D governing the use
of the sign =.

Here it is necessary to distinguish between the formal statements and the
informal statements in the Principia. Among the formal statements we find

*4.2. ⊢ . p ≡ p,
which is the same as Postulate A.

Another formal statement (in view of *4.01 and *1.01) is

*4.21. ⊢ : ~(p≡q) . v . (q≡p).

In accordance with the informal statement under (6) in *1, this formula
means

“p≡q is false or q≡p is true,”
and this in turn means

“If p≡q, then q≡p,”
which is the same as Postulate B.


Among the informal statements we find under *4.22 that “≡” denotes a
relation, namely the “relation of equivalence,” and that “the relation of
equivalence is reflexive (*4.2), symmetrical (*4.21) and transitive (*4.22).”†

We have already cited *4.2 and *4.21. In regard to *4.22, the formal
statement of this theorem (in view of *1.01) is

*4.22. ⊢ : ~(p≡q . q≡r) . v . (p≡r);
and in accordance with the informal statement under (6) in *1, this formula
means “p≡q . q≡r is false or p≡r is true.” This in turn means

“If p≡q and q≡r, then p≡r,”

which is the same as Postulate C.

Again, among the informal statements we find under *4.01 the following: 

“If p≡q, then q may be substituted for p without altering the truth-value
of any function of p which involves no primitive ideas except those
enumerated in *1.”

This comes to the same thing as our Postulate D. 

Hence, if we accept the above mentioned informal statements as a valid 
part of the theory of the Principia, we have the following theorem: 


THEOREM I. With respect to (K, v, ~, ≡), the informal system of the
Principia is a Boolean algebra.


Here, from the abstract postulational point of view, K is an undefined
class; v (or +) is an undefined binary operator; ~ (or ’) is an undefined
unary operator; and ≡ is an undefined relation. Concretely, K may be
interpreted as the class of “elementary functions”; a v b as “a or b”; ~a as
“not-a”; and a≡b as “a equivalent to b.” But any other interpretation of the
symbols (K, v, ~, ≡) which satisfies the rules laid down would be a valid
example of the system.

One further question arises. Since the number of elements in a Boolean 
algebra may be any power of 2, it is interesting to inquire whether there is 
anything in the Principia which restricts the number of elements. 

Among the formal statements we find

*5.15. ⊢ : p≡q . v . p≡~q;
and according to the informal statement under (6) in *1, this formula means
“Either p≡q is true or p≡~q is true.”

That is, if q is any particular element of the class K, then every other
element must be equivalent either to q or to q’; so that there are only two
non-equivalent elements in the system. Hence, if we accept the informal as
well as the formal statements, we have


THEOREM II. With respect to (K, v, ~, ≡), the informal system of the
Principia is a Boolean algebra containing only two non-equivalent elements.


† In the formal part of the Principia there is a different definition of ≡,
which does not concern us here.


APPENDIX II
A SET OF INDEPENDENT POSTULATES FOR PRINCIPIA MATHEMATICA 

This appendix contains a revision of the primitive propositions of the 
Principia (Section A), without making use of the equality sign, or the 
Boolean notation “=1.” (Compare Bernstein’s papers of 1931 and 1932.) 

The primitive ideas in this theory are four in number: 

K =an undefined class of elements, a, b, c,--- (K being interpretable as 
the class of “propositions”) ; 

T = an undefined subclass in K (T being interpretable as the subclass of
“true” propositions, indicated in the Principia by the assertion
sign ⊢);

a+b = the result of an undefined binary operation (a+b being interpretable
as “a or b,” denoted in the Principia by a v b);

a’ = the result of an undefined unary operation (a’ being interpretable as
“not-a,” denoted in the Principia by ~a).

The postulates here assumed are seven in number, the first five
corresponding precisely to “formal,” and the last two to “informal”
statements in the Principia:

POSTULATE 1. If a and b are in K, then a+b is in K. [*1.71]

POSTULATE 2. If a is in K, then a’ is in K. [*1.7]

POSTULATE 3. If a, b, etc. are in K, then b’+(a+b) is in T. [*1.3]

POSTULATE 4. If a, b, etc. are in K, then (a+b)’+(b+a) is in T. [*1.4]

POSTULATE 5. If a, b, c, etc. are in K, then (b’+c)’+[(a+b)’+(a+c)] is
in T. [*1.6]

POSTULATE 6. If a+b is in T, then at least one of the elements a and b is in T.

POSTULATE 7. If a’ is in T, then a is not in T.

From these postulates the following propositions are deducible as theorems.†

8. If a is in T and a’+b is in T, then b is in T. [*1.1]

9. If a, a+a, etc. are in K, then (a+a)’+a is in T. [*1.2]

10. If a, b, c, etc. are in K, then [a+(b+c)]’+[b+(a+c)] is in T. [*1.5]

† The proof of 8 follows at once from 6.

The proof of 9 depends on the following lemmas:

(a) If a+b is in T, then b+a is in T. (By 4, 6, 7.)

(b) If b is in T, then a+b is in T. (By 3, 6, 7.)

(c) If a is in T, then a+b is in T. (By (a) and (b).)

(d) If a is not in T and b is not in T, then a+b is not in T. (By 6.)

(e) If a+b is not in T, then a is not in T and b is not in T. (By (b) and (c).)

If 9 were false, we should have, by (e) and (d), (a+a)’+(a+a) not in T, contrary to 4. The
proof of 10 (due to Bernays in 1926) is given in 6.10 above.

Any system (K, T, +, ’) which satisfies Postulates 1-7 may be called an 
“informal Principia system,” since all the propositions, both “formal” and 
“informal,” in Section A of the Principia are deducible from these seven 
postulates. 

In regard to the definitions of a ⊃ b and a ≡ b, the “formal” and “informal”
statements in the Principia are not in precise agreement. In the “formal”
part (see *1.01 and *4.01), a ⊃ b and a ≡ b are defined as elements determined
by a and b (analogous to a+b). But in the “informal” part (and in practically
all common usage) the symbols ⊃ and ≡ are used to indicate relations
between the two elements. These two usages may be reconciled by defining
the symbols a ⊃ b and a ≡ b as elements, and the words “a implies b” and “a
is equivalent to b” as statements about these elements.

11a. Definition. a ⊃ b means a’+b. [*1.01]

11b. Definition. (a implies b) means (a’+b is in T).

That is, “a implies b” means that the element a ⊃ b is in T.

12a. Definition. a ≡ b means [(a’+b)’+(b’+a)’]’. [*4.01]

12b. Definition. (a is equivalent to b) means {[(a’+b)’+(b’+a)’]’ is in T}.

That is, “a is equivalent to b” means that the element a ≡ b is in T.

From Postulates 1-7, with the aid of these definitions, the following much- 
discussed theorems are deducible: 

13. If a is in K, and b is in T, then a ⊃ b is in T. [*2.02]

That is, “a true proposition is implied by any proposition.”

14. If a is not in T and b is in K, then a ⊃ b is in T. [*2.21]

That is, “a false proposition implies any proposition.”

15. If a is in T and b is in T, then a ≡ b is in T. [*5.1]

That is, “two propositions are equivalent if they are both true.”

16. If a is not in T and b is not in T, then a ≡ b is in T. [*5.21]

That is, “two propositions are equivalent if they are both false.”

Hence an “informal Principia system,” as above defined, contains only
two “non-equivalent” elements.
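Theorems 13-16 can be tried out on a small system. The sketch below is an illustration only: it assumes the four-element Boolean algebra on 0, 1, 2, 3 (0 the zero element, 1 the universe, 2 and 3 complementary) as the system (K, +, ’), forms the elements a ⊃ b = a’+b and a ≡ b = [(a’+b)’+(b’+a)’]’, and tests the theorems for a chosen subclass T. The names `PLUS`, `NEG`, `implies`, `equiv` are hypothetical.

```python
# Assumed model: the four-element Boolean algebra on {0, 1, 2, 3}.
K = range(4)
PLUS = [
    [0, 1, 2, 3],
    [1, 1, 1, 1],
    [2, 1, 2, 1],
    [3, 1, 1, 3],
]
NEG = [1, 0, 3, 2]


def implies(a, b):
    # The element a ⊃ b, taken as a' + b.
    return PLUS[NEG[a]][b]


def equiv(a, b):
    # The element a ≡ b, taken as [(a'+b)' + (b'+a)']'.
    return NEG[PLUS[NEG[implies(a, b)]][NEG[implies(b, a)]]]


def theorems_13_to_16(T):
    """Report, for a given subclass T, which of Theorems 13-16 hold."""
    return {
        13: all(implies(a, b) in T for a in K for b in K if b in T),
        14: all(implies(a, b) in T for a in K for b in K if a not in T),
        15: all(equiv(a, b) in T for a in K for b in K if a in T and b in T),
        16: all(equiv(a, b) in T for a in K for b in K
                if a not in T and b not in T),
    }
```

With T = {1, 3} all four theorems hold in this model. With T = {1}, where Postulate 6 is violated, Theorems 14 and 16 fail, which agrees with the remark that they depend on the “informal” statements.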

17. If a’+(b+c) is in T, then (a’+b)+(a’+c) is in T. [*4.78]

18. If a’+(b+c) is in T, then at least one of the elements a’+b and a’+c
is in T.

That is, if a implies b+c, then a implies b or a implies c.

It can be shown, by definite examples of systems (K, T, +, ’), that
Theorems 14, 16, and 18 cannot be deduced from the “formal” part of the
Principia (1-5, 8-10), without the aid of the “informal” statement here
listed as Postulate 6. (See Example 6, below.)

The consistency of Postulates 1-7 is shown by any one of the three follow- 
ing examples of systems (K, T, +, ’) in which all seven postulates are 
satisfied. 

Example 0.1.

K = a class of four numbers, say 0, 1, 2, 3;

T = the class containing the two numbers 1 and 3;

a+b = the number given by the table;

a’ = the number given by the table.

 + | 0 1 2 3 | ’
 0 | 0 1 2 3 | 1
 1 | 1 1 1 1 | 0
 2 | 2 1 2 1 | 3
 3 | 3 1 1 3 | 2

Example 0.2.

K = a class of five numbers, say 2, 3, 4, 8, 9;

T = the class containing the three numbers 2, 3, 4;

a+b and a’ = the numbers given by the table.

[Table of a+b and a’ for Example 0.2: entries not recoverable from the source.]

(Here the selection and arrangement of elements in each of the four
compartments of the table is arbitrary, provided the two groups 2, 3, 4 and 8, 9
are kept separate.)

Example 0.3.

K = a class of two numbers, 0, 1;

T = the class containing the single element 1;

a+b and a’ = the numbers given by the table.

 + | 0 1 | ’
 0 | 0 1 | 1
 1 | 1 1 | 0




The independence of Postulates 1-7 is shown by the following examples of 
systems (K, T, +, ’), each of which violates the like-numbered postulate 
and satisfies all the other six. 

Example 1. Same as Example 6.1, with T =1. 

Example 2. Same as Example 6.2, with T =1, 3. 

Example 3. Same as Example 6.5, with T =1, 3. 

Example 4. Same as Example 6.6, with T =1, 3. 

Example 5. Same as Example 4.6, with T =1, 2, 3, 4. 

Example 6. Same as Example 0.1, above, with T =1 instead of T=1, 3. 

Example 7. Same as Example 0.1, above, with T =0, 1, 2, 3 instead of 
T =1, 3. 

The last two examples satisfy not only Postulates 1-5, but also Theorems 
8, 9, and 10; so that Postulates 6 and 7 are not deducible from the “formal” 
part of the Principia. This fact is of fundamental importance in any discussion 
of the adequacy of the “theory of deduction” as set forth in the formal part 
of the Principia. 
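The claims about Examples 0.1, 6, and 7 can be checked mechanically. The sketch below assumes that the table of Example 0.1 is that of the four-element Boolean algebra (0 the zero element, 1 the universe, 2 and 3 complementary); this reading of the table is an assumption. It then reports which of Postulates 1-7 fail for a given subclass T. The function name `violated_postulates` is hypothetical.

```python
# Assumed table for Example 0.1: the four-element Boolean algebra.
K = range(4)
PLUS = [
    [0, 1, 2, 3],
    [1, 1, 1, 1],
    [2, 1, 2, 1],
    [3, 1, 1, 3],
]
NEG = [1, 0, 3, 2]


def violated_postulates(T):
    """Return the set of postulate numbers (1-7) that fail for subclass T."""
    bad = set()
    if any(PLUS[a][b] not in K for a in K for b in K):
        bad.add(1)  # closure under +
    if any(NEG[a] not in K for a in K):
        bad.add(2)  # closure under '
    if any(PLUS[NEG[b]][PLUS[a][b]] not in T for a in K for b in K):
        bad.add(3)  # b' + (a+b) in T
    if any(PLUS[NEG[PLUS[a][b]]][PLUS[b][a]] not in T for a in K for b in K):
        bad.add(4)  # (a+b)' + (b+a) in T
    if any(
        PLUS[NEG[PLUS[NEG[b]][c]]][PLUS[NEG[PLUS[a][b]]][PLUS[a][c]]] not in T
        for a in K for b in K for c in K
    ):
        bad.add(5)  # (b'+c)' + [(a+b)' + (a+c)] in T
    if any(PLUS[a][b] in T and a not in T and b not in T for a in K for b in K):
        bad.add(6)  # a+b in T requires a in T or b in T
    if any(NEG[a] in T and a in T for a in K):
        bad.add(7)  # a' in T requires a not in T
    return bad
```

Under the assumed table, T = {1, 3} violates nothing, T = {1} violates only Postulate 6 (since 2+3 = 1), and T = {0, 1, 2, 3} violates only Postulate 7.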


HARVARD UNIVERSITY, 
CAMBRIDGE, Mass. 


SUFFICIENT CONDITIONS FOR A PROBLEM OF MAYER 
IN THE CALCULUS OF VARIATIONS* 


BY 
G. A. BLISS AND M. R. HESTENES


1. Introduction. The general problem of Mayer with variable end points
as proposed by Bliss (V, p. 305)† is that of finding in a class of arcs

(1:1) $y_i = y_i(x)$ $(x_1 \le x \le x_2;\ i = 1, \dots, n)$

satisfying a system of differential equations and end conditions

$\phi_\alpha(x, y, y') = 0$ $(\alpha = 1, \dots, m < n)$,
$\psi_\mu[x_1, y(x_1), x_2, y(x_2)] = 0$ $(\mu = 1, \dots, p \le 2n+1)$

one which minimizes a function of the form

$g[x_1, y(x_1), x_2, y(x_2)]$.


Bliss has shown that this problem is equivalent to a problem of Bolza (V, 
p. 306) in the sense that each can be transformed into one of the other type. 
For the problem of Bolza the function to be minimized is

$J = g[x_1, y(x_1), x_2, y(x_2)] + \int_{x_1}^{x_2} f(x, y, y')\,dx$,

and it is clear at once that the problem of Mayer is a problem of Bolza having
$f \equiv 0$.

Sufficient conditions for the problem of Bolza have been established by 
Morse (XI, p. 528) and Bliss (XII, p. 271). However the hypotheses which 
they make, in particular that of normality on every sub-interval, imply that 
the function f is not identically zero, and the sets of sufficient conditions es- 
tablished by them are therefore not applicable to the problem of Mayer with- 
out further modification. In view of this fact it is the purpose of the authors of 
the present paper to establish a set of sufficient conditions for the problem 
of Mayer with variable end points. This will be done in two parts, the first of 
which is the paper here presented, dealing only with the special case in which 
the number of end conditions $\psi_\mu = 0$ is exactly $2n+1$. By methods similar to
those used by Bliss for the problem of Bolza (XII, pp. 261-274) the results 
obtained will be extended to the general case in a second paper by Hestenes. 


* Presented to the Society, April 8, 1932; received by the editors June 9, 1932. 
† Roman numerals in parentheses refer to the bibliography at the end of this paper.



306 BLISS AND HESTENES [January 


The problem considered here is an obvious generalization of the classical
problem of Mayer and reduces to the latter when the expression to be
minimized is the function $g = y_1(x_2)$ and the end conditions $\psi_\mu = 0$ are the
conditions

$x_1 - \alpha_0 = y_i(x_1) - \alpha_i = x_2 - \beta_0 = y_j(x_2) - \beta_j = 0$
$(i = 1, \dots, n;\ j = 2, \dots, n)$,

the $\alpha$'s and $\beta$'s being constants. Sufficiency theorems for the classical problem
have been established by Egorov (II, p. 376), Kneser (I, p. 250; VIII, p. 290),
and Larew (VII, p. 65), who use in each case an $n$-dimensional field defined
in the $(n+1)$-dimensional space of points $(x, y_1, \dots, y_n)$ by an
$(n-1)$-parameter family of extremals passing through a fixed point. Such a field
does not seem to be applicable to the problem considered here, but one can
use instead a field of $n+1$ dimensions defined by an $n$-parameter family of
extremals in $(x, y_1, \dots, y_n)$-space. The construction and use of such a field
are important features of this paper. An $(n+1)$-dimensional field of this sort
is applicable to the more special classical problem of Mayer also, and a
fundamental sufficiency theorem for this case can be established in this way
with greater ease and fewer restrictions than have hitherto been required.

2. Preliminary remarks. In the following pages it is assumed that the
various indices have the following ranges unless otherwise explicitly specified:

$i, k = 1, 2, \dots, n$; $\alpha, \beta = 1, 2, \dots, m < n$;
$\rho, \sigma = 1, 2, \dots, 2n+1$; $r = 1, 2, \dots, n-1$;
$s = 1, 2, \dots, 2n-1$.

The tensor analysis summation convention is used freely throughout. We
make the following hypotheses concerning a particular arc $E_{12}$ whose
minimizing properties are to be studied:

(a) The functions $y_i(x)$ defining $E_{12}$ are continuous on the interval $x_1x_2$,
and this interval can be subdivided into a finite number of parts on each of
which these functions have continuous derivatives.

(b) The functions $\phi_\alpha$ have continuous partial derivatives of the first
three orders in a neighborhood $R$ of the values $(x, y, y')$ on $E_{12}$, and at each
element $(x, y, y')$ in $R$ the matrix $\|\phi_{\alpha y_i'}\|$ has rank $m$.

(c) The functions $g$, $\psi_\rho$ have continuous partial derivatives of the first
two orders in a neighborhood of the end values $(x_1, y_{i1}, x_2, y_{i2})$ of $E_{12}$, in which
the determinant
Vox Vovis 


is different from zero. 


1933] THE PROBLEM OF MAYER 307 


An admissible set $(x, y, y')$ is a set interior to $R$ and satisfying the
equations $\phi_\alpha = 0$. An arc (1:1) having the continuity properties described in (a)
is called admissible if all of its elements $(x, y, y')$ are admissible. The
definitions of equations of variation and of admissible variations used in the
following pages are those of Bliss (V, p. 307; IX, p. 677). The problem of
Mayer here proposed can now be more precisely stated as that of finding in
the class of admissible arcs satisfying the end conditions $\psi_\rho = 0$ one which
minimizes the function $g$.


I. THE FIRST NECESSARY CONDITION. For every minimizing arc $E_{12}$ for the
problem of Mayer as here proposed there exist constants $c_i$ and a function
$F = \lambda_\alpha(x)\phi_\alpha$ such that the equations

(2:2) $F_{y_i'} = \int_{x_1}^{x} F_{y_i}\,dx + c_i$, $\phi_\alpha = 0$

are satisfied at each point of $E_{12}$. The multipliers $\lambda_\alpha(x)$ are continuous except
possibly at the values of $x$ defining corners of $E_{12}$, and do not vanish
simultaneously at any point of $E_{12}$.


To prove this theorem one needs only to combine the methods used by 
Bliss for the corresponding theorems in the problems of Mayer (V, p. 311) 
and Lagrange (IX, p. 683). It is also an immediate corollary of a theorem 
established by Morse and Myers for the problem of Bolza (X, p. 245). 


THEOREM 2:1. If the functions $\lambda_\alpha(x)$ are a set of multipliers with which an
admissible arc $E_{12}$ satisfies the equations (2:2), then for every set of admissible
variations $\xi_1, \xi_2, \eta_i(x)$ along $E_{12}$ the functions $\eta_i(x)$ satisfy the equations

(2:3) $F_{y_i'}\eta_i \big|_{x'}^{x''} = 0$

for every interval $x'x''$.


This result is readily provable by multiplying the equations of variation
$\phi_{\alpha y_i}\eta_i + \phi_{\alpha y_i'}\eta_i' = 0$
by the multipliers $\lambda_\alpha(x)$, adding, and applying the usual integration by parts
with the help of equations (2:2).
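
The computation behind this remark can be written out in a few lines. The following is a sketch under the assumption that $F = \lambda_\alpha\phi_\alpha$ with summation over repeated indices, and that (2:2) is differentiated on each sub-arc between corners.

```latex
\begin{align*}
0 &= \lambda_\alpha\bigl(\phi_{\alpha y_i}\eta_i + \phi_{\alpha y_i'}\eta_i'\bigr)
   = F_{y_i}\eta_i + F_{y_i'}\eta_i', \\
\frac{d}{dx}F_{y_i'} &= F_{y_i}
   \quad\text{(between corners, by (2:2))}, \\
\frac{d}{dx}\bigl(F_{y_i'}\eta_i\bigr)
  &= F_{y_i}\eta_i + F_{y_i'}\eta_i' = 0, \\
F_{y_i'}\eta_i\,\Big|_{x'}^{x''} &= 0 .
\end{align*}
```

The last step uses the continuity of $F_{y_i'}$ and of $\eta_i$ across corners: by (2:2) each $F_{y_i'}$ is an integral plus a constant, so the product $F_{y_i'}\eta_i$ is constant on the whole interval and not merely on each sub-arc.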
An admissible arc $E_{12}$ is said to be normal relative to the end conditions
$\psi_\rho = 0$ if there exist for it $2n+1$ sets of admissible variations
$\xi_1^\sigma, \xi_2^\sigma, \eta_i^\sigma(x)$ such that the determinant $|\Psi_\rho(\xi^\sigma, \eta^\sigma)|$ is different from zero, where

$\Psi_\rho(\xi, \eta) = (\psi_{\rho x_1} + y_{i1}'\psi_{\rho y_{i1}})\xi_1 + \psi_{\rho y_{i1}}\eta_i(x_1) + (\psi_{\rho x_2} + y_{i2}'\psi_{\rho y_{i2}})\xi_2 + \psi_{\rho y_{i2}}\eta_i(x_2)$,

the functions $y_i$, $y_i'$ occurring explicitly and in the derivatives of $\psi_\rho$ being
those belonging to $E_{12}$. The arc $E_{12}$ is normal on the sub-interval $x'x''$ if there
exist for it $2n-1$ sets of admissible variations $\xi_1^s, \xi_2^s, \eta_i^s(x)$ such that the
matrix

(2:4) $\begin{Vmatrix} \eta_i^s(x') \\ \eta_i^s(x'') \end{Vmatrix}$

has rank $2n-1$. On account of the relation (2:3) this is the highest rank
attainable for a matrix with columns of this sort belonging to an arc that
satisfies the equations (2:2) with a set of multipliers $\lambda_\alpha(x)$. For convenience an arc
that is normal relative to the end conditions $\psi_\rho = 0$ will be designated simply
as normal.


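THEOREM 2:2. An admissible arc $E_{12}$ that satisfies the necessary condition
I is normal if and only if there exist for it no set of multipliers $\lambda_\alpha(x)$, not
vanishing simultaneously, with which it satisfies equations (2:2) and for which the
determinant

(2:5) $\begin{vmatrix} 0 & F_{y_i'}(x_1) & 0 & -F_{y_i'}(x_2) \\ \psi_{\rho x_1} + y_{i1}'\psi_{\rho y_{i1}} & \psi_{\rho y_{i1}} & \psi_{\rho x_2} + y_{i2}'\psi_{\rho y_{i2}} & \psi_{\rho y_{i2}} \end{vmatrix}$

vanishes on $E_{12}$. If $E_{12}$ is normal the constant $l_0$ defined below can be taken equal
to 1, and its multipliers $\lambda_\alpha(x)$ are then unique.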

To prove the theorem we first notice that the arc $E_{12}$ is normal if and only
if there exist for it no set of constants and multipliers $l_0, l_\rho, \lambda_\alpha(x)$ having
$l_0 = 0$ but not vanishing simultaneously with which it satisfies the relations
(2:2) and

(2:6)
$l_0(g_{x_1} + y_{i1}'g_{y_{i1}}) + l_\rho(\psi_{\rho x_1} + y_{i1}'\psi_{\rho y_{i1}}) = 0$,
$l_0 g_{y_{i1}} + l_\rho\psi_{\rho y_{i1}} = F_{y_i'}(x_1)$,
$l_0(g_{x_2} + y_{i2}'g_{y_{i2}}) + l_\rho(\psi_{\rho x_2} + y_{i2}'\psi_{\rho y_{i2}}) = 0$,
$l_0 g_{y_{i2}} + l_\rho\psi_{\rho y_{i2}} = -F_{y_i'}(x_2)$.

This criterion for normality is readily established by the same methods as
those used by Bliss for the case when $E_{12}$ is an extremal (V, p. 311). If for a
set of multipliers $\lambda_\alpha(x)$ belonging to $E_{12}$ the determinant (2:5) vanishes, then
there is a set $l_0, l_\rho, c\lambda_\alpha(x)$ having $l_0 = 0$ and satisfying the equations (2:6).
Hence $E_{12}$ could not be normal. On the other hand if the determinant (2:5) is
different from zero for every set of multipliers $\lambda_\alpha(x)$ with which $E_{12}$ satisfies
equations (2:2), then there can be no set $l_0, l_\rho, \lambda_\alpha(x)$ with $l_0 = 0$ satisfying the
equations (2:6). Consequently in this case $E_{12}$ is normal. The last statement
in the theorem is readily established by the methods used by Bliss for the
case when $E_{12}$ is an extremal (V, p. 311).



THEOREM 2:3. If an admissible arc $E_{12}$ is normal on $x'x''$ and satisfies the
equations (2:2) with a set of multipliers $\lambda_\alpha(x)$, then these multipliers are unique on
the interval $x'x''$ except for a constant factor.


This is a result of the relation (2:3) which implies that the constants
$F_{y_i'}(x')$, $F_{y_i'}(x'')$ are unique except for a constant factor since it is possible
to select a matrix (2:4) having rank $2n-1$ on $x'x''$. The multipliers belonging
to $E_{12}$ on the interval $x'x''$ are then also unique except for a constant factor
since they are completely determined when the set of values $F_{y_i'}(x')$ is
specified (IX, p. 680).

3. The family of extremals. An extremal is an admissible arc with a set
of multipliers not vanishing simultaneously

$y_i = y_i(x)$, $\lambda_\alpha = \lambda_\alpha(x)$ $(x_1 \le x \le x_2)$

which have continuous derivatives $y_i'(x)$, $y_i''(x)$, $\lambda_\alpha'(x)$ and satisfy the
Euler-Lagrange equations

(3:1) $(d/dx)F_{y_i'} - F_{y_i} = 0$, $\phi_\alpha = 0$.

Such an extremal is non-singular if the determinant

$R = \begin{vmatrix} F_{y_i'y_k'} & \phi_{\beta y_i'} \\ \phi_{\alpha y_k'} & 0 \end{vmatrix}$

is different from zero along it. Along a non-singular extremal $E_{12}$ the equations

(3:2) $F_{y_i'}(x, y, y', \lambda) = z_i$, $\phi_\alpha(x, y, y') = 0$

can be solved for the variables $y_i'$, $\lambda_\alpha$ in a neighborhood of the values $(x, y, z)$
on the arc $E_{12}$. The solution has the form

(3:3) $y_i' = P_i(x, y, z)$, $\lambda_\alpha = \Lambda_\alpha(x, y, z)$,

and has continuous partial derivatives of the first two orders since the first
members of equations (3:2) have such derivatives. The system of equations
(3:1) is now equivalent to the system

(3:4) $dy_i/dx = P_i(x, y, z)$, $dz_i/dx = F_{y_i}[x, y, P(x, y, z), \Lambda(x, y, z)]$.

The functions $F$, $P_i$, $\Lambda_\alpha$ satisfy the homogeneity relations

(3:5)
$F(x, y, y', k\lambda) = kF(x, y, y', \lambda)$,
$P_i(x, y, kz) = P_i(x, y, z)$,
$\Lambda_\alpha(x, y, kz) = k\Lambda_\alpha(x, y, z)$ $(k \ne 0)$.

The first of these relations is a consequence of the definition of $F$. The last two
follow from the fact that the two sets

$[x, y, kz, P(x, y, z), k\Lambda(x, y, z)]$,
$[x, y, kz, P(x, y, kz), \Lambda(x, y, kz)]$

satisfy equations (3:2) and must be identical since the solutions $P$, $\Lambda$ of
these equations are unique when $x, y, z$ are given.

Through every element $(x_0, y_0, z_0)$ in a neighborhood of the set of values
$(x, y, z)$ on the extremal $E_{12}$ there passes a unique solution

(3:6) $y_i = y_i(x, x_0, y_0, z_0)$, $z_i = z_i(x, x_0, y_0, z_0)$

of equations (3:4) for which the functions $y_i, y_{ix}, z_i, z_{ix}$ have continuous partial
derivatives of the first two orders since the second members of equations (3:4)
have such derivatives. The functions $y_i(x, x_0, y_0, z_0)$, $kz_i(x, x_0, y_0, z_0)$ are
solutions of equations (3:4), on account of the homogeneity properties (3:5),
and have the initial values $(x, y, z) = (x_0, y_0, kz_0)$. Since the solutions with
these initial values are unique it follows that

(3:7)
$y_i(x, x_0, y_0, kz_0) = y_i(x, x_0, y_0, z_0)$,
$z_i(x, x_0, y_0, kz_0) = kz_i(x, x_0, y_0, z_0)$.

Since each curve (3:6) has an initial set at $x = x_{10}$ we lose none of them if we
replace $x_0$ by the fixed value $x_{10}$. Furthermore not all the constants $z_{i0}$ are
zero at the initial element of $E_{12}$. We may therefore renumber the solutions
(3:6) so that $z_{n0}$ is different from zero. On account of the homogeneity
relations (3:7) it follows that the initial elements $(x_{10}, y_0, z_0)$, $(x_{10}, y_0, kz_0)$
determine the same curves $y_i = y_i(x, x_{10}, y_0, z_0)$. Hence we lose none of these curves
if we assign to $z_{n0}$ the fixed value of $z_n$ belonging to $E_{12}$ at the point 1. Let us
for convenience rename the constants $y_{10}, \dots, y_{n0}, z_{10}, \dots, z_{n-1,0}$
and call them $c_1, c_2, \dots, c_{2n-1}$ respectively. The family (3:6) then takes the
form


(3:8) $y_i = y_i(x, c)$, $z_i = z_i(x, c)$.

The equations

$c_i = y_i(x_{10}, c)$, $c_{n+r} = z_r(x_{10}, c)$, $z_{n0} = z_n(x_{10}, c)$

express the fact that the solutions (3:8) pass through the initial element

$(x, y_1, \dots, y_n, z_1, \dots, z_{n-1}, z_n) = (x_{10}, c_1, \dots, c_n, c_{n+1}, \dots, c_{2n-1}, z_{n0})$

and from them we find by differentiation that the determinant

(3:9) $\begin{vmatrix} y_{ic_s} & 0 \\ z_{ic_s} & z_i \end{vmatrix}$

takes the value $z_{n0}$ at $x = x_{10}$. When we substitute the functions (3:8) in
equations (3:3) a set of functions $\lambda_\alpha(x, c)$ is determined, and we have the
final result:


THEOREM 3:1. Every non-singular extremal arc $E_{12}$ is a member of a
$(2n-1)$-parameter family of extremals

(3:10) $y_i = y_i(x, c)$, $\lambda_\alpha = \lambda_\alpha(x, c)$ $(x_1 \le x \le x_2)$

for special values $(x_1, x_2, c) = (x_{10}, x_{20}, c_0)$. The functions $y_i, y_{ix}, z_i, z_{ix}, \lambda_\alpha$ have
continuous first and second partial derivatives in a neighborhood of the values
$(x, c)$ defining $E_{12}$, and for the special values $(x_{10}, c_0)$ the determinant (3:9) is
different from zero.
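
The value of the determinant (3:9) at $x = x_{10}$ can be read off from a block form. The sketch below assumes the ordering of rows $y_1, \dots, y_n, z_1, \dots, z_{n-1}, z_n$ and of columns $c_1, \dots, c_{2n-1}$ followed by the column $(0, z_i)$, together with the initial conditions $c_i = y_i(x_{10}, c)$, $c_{n+r} = z_r(x_{10}, c)$, $z_{n0} = z_n(x_{10}, c)$; here $I_n$ denotes the identity matrix of order $n$.

```latex
\[
\begin{vmatrix} y_{ic_s} & 0 \\ z_{ic_s} & z_i \end{vmatrix}_{x = x_{10}}
=
\begin{vmatrix}
I_n & 0 & 0 \\
0 & I_{n-1} & z_{r0} \\
0 & 0 & z_{n0}
\end{vmatrix}
= z_{n0} .
\]
```

The matrix is block upper triangular, so its determinant is the product of the diagonal blocks, and $z_{n0} \ne 0$ by the choice of numbering made above.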


4. The second variation for a normal extremal. Consider a normal
extremal arc $E_{12}$ with ends satisfying the conditions $\psi_\rho = 0$. Let $\xi_1, \xi_2, \eta_i(x)$ be a
set of admissible variations along $E_{12}$ satisfying the equations $\Psi_\rho(\xi, \eta) = 0$.
It can be shown that there is a one-parameter family of admissible arcs

(4:1) $y_i = y_i(x, b)$, $x_1(b) \le x \le x_2(b)$,

satisfying the end conditions $\psi_\rho = 0$, containing $E_{12}$ for $b = 0$, and having
$\xi_1, \xi_2, \eta_i(x)$ as its variations along $E_{12}$ (IX, p. 695). The functions $x_1(b)$, $x_2(b)$,
$y_i(x, b)$, $y_{ix}(x, b)$ are continuous in a neighborhood of the values $(x, b)$ defining
$E_{12}$ and their derivatives $x_{1b}, x_{1bb}, x_{2b}, x_{2bb}, y_{ib}, y_{ixb}, y_{ibb}, y_{ixbb}$ have the same property
except possibly at the values of $x$ defining the corners of the arc $\eta_i = \eta_i(x)$
$(x_1 \le x \le x_2)$ in $x\eta$-space.
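When the equations

$g(b) = g[x_1(b), y(x_1(b), b), x_2(b), y(x_2(b), b)]$,
$0 = \psi_\rho[x_1(b), y(x_1(b), b), x_2(b), y(x_2(b), b)]$,
$0 = \phi_\alpha[x, y(x, b), y'(x, b)]$

are multiplied by constants and multipliers $l_0, l_\rho, \lambda_\alpha(x)$, where $l_0, l_\rho$ are to be
determined later and the functions $\lambda_\alpha(x)$ are the multipliers belonging to $E_{12}$,
it is found by suitable additions that

$l_0 g(b) = G[x_1(b), y(x_1(b), b), x_2(b), y(x_2(b), b)]$,
$0 = F[x, y(x, b), y'(x, b), \lambda(x)]$,

where $G = l_0 g + l_\rho\psi_\rho$. By differentiating these equations for $b$ it follows further
that

$l_0 g'(b) = (G_{x_1} + y_{i1}'G_{y_{i1}})x_{1b} + G_{y_{i1}}y_{ib}(x_1) + (G_{x_2} + y_{i2}'G_{y_{i2}})x_{2b} + G_{y_{i2}}y_{ib}(x_2)$,
$0 = F_{y_i}y_{ib} + F_{y_i'}y_{ixb}$,

and a second differentiation gives for $b = 0$

(4:2) $l_0 g''(0) = (G_{x_1} + y_{i1}'G_{y_{i1}})x_{1bb} + G_{y_{i1}}y_{ibb}(x_1) + (G_{x_2} + y_{i2}'G_{y_{i2}})x_{2bb} + G_{y_{i2}}y_{ibb}(x_2) + Q[\xi_1, \eta(x_1), \xi_2, \eta(x_2)]$,

(4:3) $0 = F_{y_i}y_{ibb} + F_{y_i'}y_{ixbb} + 2\omega(x, \eta, \eta')$,

where $Q$ is a quadratic form in the variations $\xi_1, \eta_i(x_1), \xi_2, \eta_i(x_2)$ of the family
(4:1) along $E_{12}$ and

(4:4) $2\omega(x, \eta, \eta') = F_{y_iy_k}\eta_i\eta_k + 2F_{y_iy_k'}\eta_i\eta_k' + F_{y_i'y_k'}\eta_i'\eta_k'$.

When equation (4:3) is integrated from $x_1$ to $x_2$, it is found with the help of
the Euler-Lagrange equations (3:1) that

(4:5) $0 = F_{y_i'}y_{ibb}\big|_{x_1}^{x_2} + \int_{x_1}^{x_2} 2\omega(x, \eta, \eta')\,dx$.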


From the hypothesis (c) of § 2, and since $E_{12}$ is normal, we can determine the
constants $l_0, l_\rho$ to satisfy equations (2:6) with $l_0 = 1$. Hence by adding equations
(4:2) and (4:5) it follows that the second variation $I_2$ along $E_{12}$ can be
expressed in the form

(4:6) $I_2 = g''(0) = Q[\xi_1, \eta(x_1), \xi_2, \eta(x_2)] + \int_{x_1}^{x_2} 2\omega(x, \eta, \eta')\,dx$,

and this expression must be $\ge 0$ for every set of admissible variations $\xi_1, \xi_2,
\eta_i(x)$ along $E_{12}$ satisfying the conditions $\Psi_\rho = 0$.

Since $E_{12}$ is normal the relation (2:3) and Theorem 2:2 imply that every
set of admissible variations $\xi_1, \xi_2, \eta_i(x)$ along $E_{12}$ satisfying the conditions
$\Psi_\rho = 0$ also satisfies the equations $\xi_1 = \eta_i(x_1) = \xi_2 = \eta_i(x_2) = 0$. Hence in the
expression (4:6) the value of the quadratic form $Q$ is always zero, and we
have the following theorem:


THEOREM 4:1. Along a normal extremal arc $E_{12}$ with ends satisfying the
conditions $\psi_\rho = 0$ the second variation is always expressible in the form

$I_2 = \int_{x_1}^{x_2} 2\omega(x, \eta, \eta')\,dx$

for all admissible variations $\xi_1, \xi_2, \eta_i(x)$ satisfying the equations $\Psi_\rho = 0$, where
$2\omega$ is the quadratic form (4:4). If $g(E_{12})$ is to be a minimum for the problem of
Mayer as here proposed, then this second variation must be $\ge 0$ for every set of
admissible variations $\eta_i(x)$ satisfying the relations

(4:7) $\eta_i(x_1) = \eta_i(x_2) = 0$.
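
The vanishing of the end variations used above can be seen from a linear system. The sketch below assumes that the determinant (2:5) has the first row $(0,\ F_{y_i'}(x_1),\ 0,\ -F_{y_i'}(x_2))$ placed over the $2n+1$ rows of coefficients of $\Psi_\rho$; that form is a reconstruction and should be read as an assumption.

```latex
\begin{align*}
F_{y_i'}(x_1)\,\eta_i(x_1) - F_{y_i'}(x_2)\,\eta_i(x_2) &= 0
  && \text{(from (2:3) on } x_1x_2\text{)}, \\
(\psi_{\rho x_1} + y_{i1}'\psi_{\rho y_{i1}})\,\xi_1
  + \psi_{\rho y_{i1}}\,\eta_i(x_1)
  + (\psi_{\rho x_2} + y_{i2}'\psi_{\rho y_{i2}})\,\xi_2
  + \psi_{\rho y_{i2}}\,\eta_i(x_2) &= 0
  && (\rho = 1, \dots, 2n+1).
\end{align*}
```

These are $2n+2$ homogeneous linear equations in the $2n+2$ unknowns $\xi_1, \eta_i(x_1), \xi_2, \eta_i(x_2)$, and their coefficient determinant is (2:5). For a normal arc that determinant is different from zero, so only the trivial solution exists and the quadratic form $Q$ in (4:6) contributes nothing.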




Since the functions $\eta_i(x)$ satisfy the differential equations of variation

(4:8) $\Phi_\alpha(x, \eta, \eta') = \phi_{\alpha y_i}\eta_i + \phi_{\alpha y_i'}\eta_i' = 0$

it is clear that the properties of the second variation suggest a minimum
problem which is a problem of Lagrange (cf. VI, p. 16), namely, that of
minimizing $I_2$ in the class of arcs

(4:9) $\eta_i = \eta_i(x)$ $(x_1 \le x \le x_2)$

satisfying equations (4:8) and passing through the fixed points $(x_1, 0)$,
$(x_2, 0)$ in $x\eta$-space as indicated by equations (4:7). One readily verifies that
this problem is abnormal since, as was seen in §2, the rank of the matrix (2:4)
cannot exceed $2n-1$ on $E_{12}$. However, by a suitable modification of the end
conditions the problem can be made normal. For this purpose we replace the
condition that the arc (4:9) passes through the fixed points $(x_1, 0)$, $(x_2, 0)$ in
$x\eta$-space by the conditions


(4:10) = = % — = = 0 (i= 1,--+,n;1 Ap), 


where $p$ is chosen so that $F_{y_p'}(x_2) \ne 0$. The two sets of end conditions are
equivalent since the relation (2:3) implies that $\eta_p(x_2) = 0$ whenever the
conditions (4:10) are satisfied.


To prove that the new accessory problem just described is normal we use
the fact that since $E_{12}$ is normal there is a determinant of the form $|\Psi_\rho(\xi^\sigma, \eta^\sigma)|$
which is different from zero on $E_{12}$. The matrix of this determinant is the
product of two matrices, the first of which is formed by deleting the first row
of the matrix (2:5) and has rank $2n+1$, and the second of which is a matrix
having $2n+1$ columns of the form

(4:11) $\xi_1^\sigma,\ \eta_i^\sigma(x_1),\ \xi_2^\sigma,\ \eta_i^\sigma(x_2)$.

This second matrix must also have rank $2n+1$ if the original determinant is
to be different from zero, and the determinant formed from this second
matrix by leaving out the row of elements $\eta_p^\sigma(x_2)$ must be different from zero,
as one readily sees with the help of the relation (2:3). This last determinant
is however one of the form whose non-vanishing insures the normality of the
accessory problem with end conditions (4:10).

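The Euler-Lagrange equations for the $x\eta$-problem are the equations

(4:12) $(d/dx)\Omega_{\eta_i'} - \Omega_{\eta_i} = 0$, $\Phi_\alpha(x, \eta, \eta') = 0$,

where $\Omega(x, \eta, \eta', \mu) = \mu_0\omega + \mu_\alpha\Phi_\alpha$. These equations are known as the accessory
equations for the original Mayer problem.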




THEOREM 4:2. If the functions $\mu_0 = 1$, $\mu_\alpha(x)$ are a set of multipliers with
which an admissible arc (4:9) for the $x\eta$-problem satisfies equations (4:12),
then every set of functions $\rho_0 = 1$, $\rho_\alpha(x)$ having this property is of the form $\rho_0 = 1$,
$\rho_\alpha(x) = \mu_\alpha(x) + k\lambda_\alpha(x)$, where the functions $\lambda_\alpha(x)$ are the multipliers for $E_{12}$ and
$k$ is an arbitrary constant.


This follows because if $\rho_0 = 1$, $\rho_\alpha(x)$ are a second set of multipliers for the arc
(4:9), then the differences $\rho_\alpha(x) - \mu_\alpha(x)$ must be multipliers for the original
problem and hence be of the form $\rho_\alpha(x) - \mu_\alpha(x) = k\lambda_\alpha(x)$, since $E_{12}$ is normal.
This proves the theorem (cf. VI, p. 19).

An admissible arc (4:9) having associated with it a set of multipliers $\mu_0$,
$\mu_\alpha(x)$ with which it satisfies equations (4:12) will also satisfy the
transversality condition for the accessory problem just described if it satisfies the
relation $\Omega_{\eta_p'}(x_2) = 0$ (IX, p. 693). Since $E_{12}$ is normal and $F_{y_p'}(x_2) \ne 0$ it follows
that a solution $\eta_i(x)$, $\rho_0 = 1$, $\rho_\alpha = \mu_\alpha(x) + k\lambda_\alpha(x)$ of equations (4:12) satisfies the
transversality condition $\Omega_{\eta_p'}(x_2) = 0$ for a suitably selected value of the
constant $k$.
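
The value of $k$ can be made explicit. The sketch below assumes that $\Omega = \mu_0\omega + \mu_\alpha\Phi_\alpha$ with $\mu_0 = 1$, so that $\Omega_{\eta_p'} = \omega_{\eta_p'} + \mu_\alpha\phi_{\alpha y_p'}$; the superscripts $(\rho)$ and $(\mu)$ are introduced here only to mark which set of multipliers is used.

```latex
\begin{align*}
\Omega^{(\rho)}_{\eta_p'}
  &= \omega_{\eta_p'} + \bigl(\mu_\alpha + k\lambda_\alpha\bigr)\phi_{\alpha y_p'}
   = \Omega^{(\mu)}_{\eta_p'} + k\,F_{y_p'}, \\
\Omega^{(\rho)}_{\eta_p'}(x_2) = 0
  \quad&\Longleftrightarrow\quad
  k = -\,\frac{\Omega^{(\mu)}_{\eta_p'}(x_2)}{F_{y_p'}(x_2)} .
\end{align*}
```

The division is legitimate because $p$ was chosen so that $F_{y_p'}(x_2) \ne 0$.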

Let us now assume that $E_{12}$ is also non-singular. Then the determinant $R$
is different from zero along $E_{12}$, and the equations

$\Omega_{\eta_i'}(x, \eta, \eta', \mu) = \zeta_i$, $\Phi_\alpha(x, \eta, \eta') = 0$

with $\mu_0 = 1$ can be solved for the variables $\eta_i'$, $\mu_\alpha$. The solution has the form

$\eta_i' = H_i(x, \eta, \zeta)$, $\mu_\alpha = M_\alpha(x, \eta, \zeta)$,

and the accessory equations (4:12) with $\mu_0 = 1$ are now equivalent to the
equations

(4:13)
$d\eta_i/dx = H_i(x, \eta, \zeta)$,
$d\zeta_i/dx = \Omega_{\eta_i}[x, \eta, H(x, \eta, \zeta), M(x, \eta, \zeta)]$,

which are linear and homogeneous in the variables $\eta_i$, $\zeta_i$. They have the
solution $\eta_i = 0$, $\zeta_i = z_i(x)$, where $z_i(x)$ are the values of the derivatives $F_{y_i'}$ along
$E_{12}$, since the corresponding values $\eta_i = 0$, $\mu_\alpha = \lambda_\alpha(x)$ reduce the first equations
(4:12) to the Euler-Lagrange equations (3:1). It is known that for equations
(4:13) a set of $2n-1$ solutions $u_{is}$, $v_{is}$ whose determinant

(4:14) $\begin{vmatrix} u_{is} & 0 \\ v_{is} & z_i \end{vmatrix}$

is different from zero for one value of $x$, has that determinant different from
zero for all values of $x$. Furthermore every solution $(\eta_i, \zeta_i)$ of equations (4:13)
is expressible in the form

(4:15) $\eta_i = c_s u_{is}$, $\zeta_i = c_s v_{is} + kz_i$,




1933] THE PROBLEM OF MAYER 315 


where $c_s$, $k$ are constants (IV, pp. 153-4). One readily verifies that the columns 
of the determinant (3:9) are a set of solutions of equations (4:13) like those 
in the columns of (4:14) (IX, p. 726). 
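
The persistence of a non-vanishing determinant such as (4:14) is an instance of Liouville's formula for linear homogeneous systems. The following is only a sketch, assuming that the system (4:13) is written in matrix form $dY/dx = A(x)Y$ with a continuous coefficient matrix; the symbols $Y$, $A$, $\Delta$ are introduced here for illustration and are not the paper's notation.

```latex
% Y(x): 2n x 2n matrix whose columns are solutions of dY/dx = A(x) Y
\[
\Delta(x) = \det Y(x), \qquad
\frac{d\Delta}{dx} = \operatorname{tr} A(x)\,\Delta(x),
\]
\[
\Delta(x) = \Delta(x_0)\,\exp\!\left(\int_{x_0}^{x} \operatorname{tr} A(t)\,dt\right).
\]
% The exponential factor never vanishes, so
% \Delta(x_0) \neq 0 implies \Delta(x) \neq 0 for every x.
```

In the present setting the columns of $Y$ would be the $2n-1$ solutions together with the known solution $\eta_i = 0$, $\zeta_i = z_i(x)$.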

As an immediate consequence of the relation (4:15) it follows that there 
is one and only one solution $(\eta_i, \zeta_i)$ of equations (4:12) taking prescribed 
values $\eta_{i0}$, $\zeta_{i0}$ at a given value $x_0$. In particular the only solution taking the 
values $\eta_{i0} = \zeta_{i0} = 0$ at $x = x_0$ is the solution $\eta_i = \zeta_i = 0$. Furthermore, since $E_{12}$ is 
normal the only solution having $\eta_i = 0$ on $x_1x_2$ is the solution $\eta_i = 0$, $\zeta_i = 
kz_i(x)$. The same is true on a sub-interval $x'x''$ provided $E_{12}$ is normal on this 
sub-interval. 

5. The necessary condition of Mayer. A value $x_3 \neq x_1$ is said to define a 
point 3 conjugate to 1 on $E_{12}$ if there exists a solution $\eta_i = u_i(x)$, $\mu_0 = 1$, 
$\mu_\alpha = \rho_\alpha(x)$ of equations (4:12) whose functions $u_i(x)$ satisfy the relations 
$u_i(x_1) = u_i(x_3) = 0$ but are not all identically zero on $x_1x_3$. 


IV. THE NECESSARY CONDITION OF MAYER. Let $E_{12}$ be a non-singular normal 
extremal arc, normal on every pair of sub-intervals $x_1x_3$ and $x_3x_2$. If $E_{12}$ is a 
minimizing arc for the problem of Mayer as here proposed, then between 1 and 2 
on $E_{12}$ there can be no points 3 conjugate to 1. 


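If there were a solution $\eta_i = u_i(x)$, $\mu_0 = 1$, $\mu_\alpha = \rho_\alpha(x)$ of equations (4:12) 
whose functions $u_i(x)$ vanish at $x_1$ and $x_3$ but are not all identically zero on 
$x_1x_3$, then for the functions $\eta_i(x)$, $\mu_0$, $\mu_\alpha(x)$ defined by the equations 


(5:1) $\eta_i(x) = u_i(x), \quad \mu_0 = 1, \quad \mu_\alpha = \rho_\alpha(x)$ on $x_1x_3$, 
      $\eta_i(x) = 0, \quad \mu_0 = 1, \quad \mu_\alpha = 0$ on $x_3x_2$, 


the second variation $J_2$ would take the value zero (IX, p. 726). It follows that 
the arc 


(5:2) $\eta_i = \eta_i(x) \qquad (x_1 \leq x \leq x_2)$ 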


would be a minimizing arc for the xn-problem since E;: is to be a solution of 
the original problem. Hence there would be associated with the arc (5:2) a 
function 2=w+y.%. with which it would satisfy the accessory equations 
(4:12), the transversality condition Q,,-(x2)=0, and the condition that the 
derivatives Q,,(x) are continuous on the interval «x2. As was seen above the 
most general multipliers possible for the functions 7;(x) would have the forms 
HMo=1, on the interval xx; and wo=1, on the 
interval x34. On account of the transversality condition Q,,,(%2)=0 it is 
found that d=0 since F,,.(2) #0. Hence at «=; the corner condition would 
require 




$\Omega_{\eta_i'}(x_3 - 0) = \left[\omega_{\eta_i'} + (\rho_\alpha + c\lambda_\alpha)\Phi_{\alpha\eta_i'}\right]^{x_3} = 0.$ 


It follows that there would exist for the arc (5:2) a set of multipliers $\mu_0 = 1$, 
$\mu_\alpha = \rho_\alpha(x) + c\lambda_\alpha(x)$ such that at $x = x_3$ the functions $\zeta_i = \Omega_{\eta_i'}(x, u, u', \rho + c\lambda)$ 
vanish as well as $\eta_i = u_i$. Hence the functions $\eta_i(x)$, $\zeta_i(x)$ would all vanish 
identically on $x_1x_3$ which is not the case, and the theorem is therefore 
established (cf. VI, p. 18). 

6. The determination of conjugate points. Consider a non-singular, normal 
extremal arc $E_{12}$ that is normal on every sub-interval $x_1x_3$. 


THEOREM 6:1. Let $u_{is}$, $v_{is}$ be $2n-1$ solutions of equations (4:13) whose 
determinant (4:14) is different from zero at $x = x_1$. A value $x_3 \neq x_1$ determines a 
point 3 conjugate to 1 on $E_{12}$ if and only if the matrix 


(6:1) $\begin{Vmatrix} u_{is}(x_1) \\ u_{is}(x_3) \end{Vmatrix}$ 


has rank $< 2n-1$. 


This theorem is a simple extension of a theorem given by Larew and can 
be proved by the same methods (VI, p. 20). 

If now we select $2n-1$ solutions $u_{is}$, $v_{is}$ of equations (4:13), as in Theorem 
6:1, and such that at $x = x_1$ the functions $u_{is}(x)$ have the values 


Uir(%1) = 0, = Siz (6; = 1, 5% = 0 for i#k), 


then it is clear that the matrix (6:1) for this set has rank $2n-1$ if and only 
if the matrix $\|u_{ir}(x_3)\|$ has rank $n-1$. With this in mind we can prove the 
following theorem: 


THEOREM 6:2. Let $u_{ik}$, $v_{ik}$ be $n$ solutions of equations (4:13) which at $x = x_1$ 
satisfy the relations 


$u_{ir}(x_1) = 0, \qquad |\,v_{ir}(x_1) \;\; z_i(x_1)\,| \neq 0 \qquad (r = 1, \cdots, n-1),$ 


$u_{in}(x_1) = z_i(x_1), \qquad v_{in}(x_1) = 0.$ 


A value $x_3 \neq x_1$ determines a point 3 conjugate to 1 on $E_{12}$ if and only if $D(x_3) = 0$, 
where $D(x) = |u_{ik}(x)|$. 


The theorem follows at once from our previous considerations if we show 
that $D(x_3)$ vanishes if and only if the matrix $\|u_{ir}(x_3)\|$ has rank $< n-1$. If 
now $D(x_3) = 0$, then there exist constants $a_k$, not all zero, such that $u_{ik}(x_3)a_k 
= 0$. On account of the relation (2:3) for the functions $\eta_i(x) = u_{ik}(x)a_k$ and 
the values of $u_{ik}$ at $x = x_1$ it follows that 


$0 = z_i(x_3)u_{ik}(x_3)a_k = z_i(x_1)u_{ik}(x_1)a_k = z_i(x_1)z_i(x_1)a_n.$ 


Hence $a_n = 0$, and the matrix $\|u_{ir}(x_3)\|$ has rank $< n-1$. The converse is 
immediate, and the theorem is established. 
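
As an illustration only, the determinant criterion can be followed in the classical case of one dependent variable without side conditions. This is not the Mayer problem itself: the accessory system is assumed here to take the normalized form $d\eta/dx = \zeta$, $d\zeta/dx = -\eta$, as for the integral $\int (y'^2 - y^2)\,dx$. The solution vanishing at $x_1$ then plays the role of $D(x)$.

```latex
% Assumed accessory system (classical analogue): eta' = zeta, zeta' = -eta
\[
u(x) = \sin(x - x_1), \qquad v(x) = \cos(x - x_1), \qquad
u(x_1) = 0, \quad v(x_1) = 1 \neq 0,
\]
\[
D(x) = u(x) = \sin(x - x_1), \qquad
D(x_3) = 0 \iff x_3 = x_1 + k\pi \quad (k = \pm 1, \pm 2, \dots).
\]
```

Under these assumptions the first point conjugate to 1 after $x_1$ is $x_3 = x_1 + \pi$.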

7. Mayer fields and a fundamental sufficiency theorem. The importance 
of the introduction of the notion of an $(n+1)$-dimensional field in the space 
of points $(x, y_1, \cdots, y_n)$ for the problems of Mayer will be seen from the 
following considerations. 

DEFINITION OF A MAYER FIELD. A Mayer field for the problem considered 
in this paper is a region $\mathfrak{F}$ in xy-space containing only interior points and 
having associated with it a set of functions $p_i(x, y)$, $\lambda_\alpha(x, y)$ with the following 
properties: 

(a) they have continuous first partial derivatives in $\mathfrak{F}$; 

(b) the sets $[x, y, p(x, y)]$ defined by the points $(x, y)$ in $\mathfrak{F}$ are all 
admissible; 

(c) the integral 


$I^* = \int \{F(x, y, p, \lambda)\,dx + (dy_i - p_i\,dx)F_{y_i'}(x, y, p, \lambda)\}$ 


formed with these functions is independent of the path in $\mathfrak{F}$. 

This definition of a field is precisely the one given by Bliss for the problem 
of Lagrange except for the form of the function $F$ (IX, p. 730). It should be 
noted that for the problem of Mayer here discussed the function $F(x, y, p, \lambda)$ 
vanishes identically in $\mathfrak{F}$, which is not in general true for the problems of 
Lagrange. Bliss has shown that the solutions $y_i(x)$ of the equations $dy_i/dx 
= p_i(x, y)$ are extremals with multipliers $\lambda_\alpha(x, y(x))$, called extremals of the 
field. It is clear that the value of $I^*$ is zero along every extremal of the field. 


THEOREM 7:1. If $E_{12}$ is a normal extremal arc of a field $\mathfrak{F}$ with ends 
satisfying the conditions $\psi_\mu = 0$, then there is a neighborhood N of the ends of $E_{12}$ in 
$(x_1y_1x_2y_2)$-space such that for every admissible arc $C_{34}$ in $\mathfrak{F}$ with ends in N 
satisfying the conditions $\psi_\mu = 0$ the formula 


(7:1) $g(C_{34}) - g(E_{12}) = \dfrac{1}{\lambda_0}\displaystyle\int_{x_3}^{x_4} E[x, y, p(x, y), \lambda(x, y), y']\,dx$ 


holds, where $\lambda_0$ is a suitably chosen positive constant, 
$E(x, y, p, \lambda, y') = F(x, y, y', \lambda) - F(x, y, p, \lambda) - (y_i' - p_i)F_{y_i'}(x, y, p, \lambda),$ 


and the arguments $y_i(x)$, $y_i'(x)$ occurring in the integrand are those belonging to 
$C_{34}$. 




As a first step in the proof consider the equations 
$g(x_1, y_1, x_2, y_2) = g, \qquad \psi_\mu(x_1, y_1, x_2, y_2) = 0.$ 


By hypothesis they are satisfied by the set $[x_1, y_1, x_2, y_2, g(E_{12})]$ belonging to 
$E_{12}$. Since the determinant (2:1) is different from zero these equations have 
solutions of the form 


(7:2) $x_1 = x_1(g), \quad y_{i1} = y_{i1}(g), \quad x_2 = x_2(g), \quad y_{i2} = y_{i2}(g)$ 


which have continuous second derivatives in a neighborhood of the value 
$g = g(E_{12})$. Furthermore, in a sufficiently small neighborhood N of the ends 
of $E_{12}$ the only solutions are those defined by equations (7:2). These equations 
define two arcs A, B through the ends of $E_{12}$. 
The equations 

logs, +1 = — PiF 9, D, 

+ = Fy (x, d)| 

logs, + = DiF (%, r)| 


where the variables $x_1$, $y_{i1}$, $x_2$, $y_{i2}$ are replaced by the right members of 
equations (7:2), determine continuous functions $l_0(g)$, $l_\mu(g)$. When they are 
multiplied by the differentials $dx_1$, $dy_{i1}$, $dx_2$, $dy_{i2}$ belonging to the arcs A, B and 
added, it is found that 
2 
(7:3) lodg Fy (dyi pidx) 
1 


In order to compare the values of g for the arcs $E_{12}$ and $C_{34}$ this last equation 
may be integrated from $g = g(E_{12})$ to $g = g(C_{34})$. By then applying the first law 
of the mean to the left member, an equation of the form 


(7:4) $\lambda_0[g(C_{34}) - g(E_{12})] = I^*(A_{13}) - I^*(B_{24})$ 


is obtained, where $\lambda_0$ is a suitably selected mean value of the function $l_0(g)$ 
on $E_{12}$. Since $E_{12}$ is normal we may suppose $l_0 = 1$ on $E_{12}$, according to the 
agreement made in §2. Consequently the neighborhood N can be chosen so 
small that $l_0(g) > 0$ and hence $\lambda_0 > 0$ in N. Furthermore, since $I^*$ is 
independent of the path in $\mathfrak{F}$ it is clear that 


(7:5) $I^*(A_{13}) - I^*(B_{24}) = I^*(E_{12}) - I^*(C_{34}) = -I^*(C_{34}),$ 


the last equality being valid since $I^*$ vanishes identically along the extremal 
$E_{12}$ of the field. The theorem now follows at once from equations (7:4) and 
(7:5) since, as is easily seen, the value of $-I^*(C_{34})/\lambda_0$ is equal to the value of 
the second member of equation (7:1). 




It is now possible to prove the following important theorem: 


THEOREM 7:2. A FUNDAMENTAL SUFFICIENCY THEOREM. Let a normal 
extremal arc $E_{12}$ be an extremal of a field $\mathfrak{F}$. Suppose that the ends of $E_{12}$ satisfy the 
conditions $\psi_\mu = 0$ and that there is a neighborhood N of these ends in $(x_1y_1x_2y_2)$- 
space such that no other extremal of the field has ends in N satisfying the equations 
$\psi_\mu = 0$. If at each point of $\mathfrak{F}$ the condition 


$E[x, y, p(x, y), \lambda(x, y), y'] > 0$ 


holds for every admissible set $(x, y, y') \neq (x, y, p)$, then the neighborhood N can be 
so restricted that the inequality $g(C_{34}) > g(E_{12})$ is true for every admissible arc 
$C_{34}$ in $\mathfrak{F}$ with ends in N satisfying the conditions $\psi_\mu = 0$ and not identical with 
$E_{12}$. 


To prove this, restrict N so as to be effective as in Theorem 7:1. It follows 
at once from Theorem 7:1 that the inequality $g(C_{34}) \geq g(E_{12})$ is necessarily 
satisfied by every admissible arc $C_{34}$ in $\mathfrak{F}$ with ends in N satisfying the 
conditions $\psi_\mu = 0$. The equality sign is appropriate only when the E-function 
vanishes along $C_{34}$, that is, only when $y_i' = p_i$ at each point of $C_{34}$. But in that 
case $C_{34}$ would be an extremal of the field and would coincide with $E_{12}$ since 
$E_{12}$ is the only extremal of the field with ends in N satisfying the conditions 
$\psi_\mu = 0$. 

8. An auxiliary theorem. A normal extremal arc $E_{12}$ is said to satisfy the 
Clebsch condition III' if at each element $(x, y, y', \lambda)$ on it the inequality 


$F_{y_i'y_k'}\Pi_i\Pi_k > 0$ 


holds for every set $(\Pi_1, \cdots, \Pi_n) \neq (0, \cdots, 0)$ which is a solution of the 
equations $\phi_{\alpha y_i'}\Pi_i = 0$. The arc $E_{12}$ satisfies the Mayer condition IV' if there is 
no point 3 conjugate to 1 on $E_{12}$ between 1 and 2 or at 2. 

In this section we propose to construct $n$ solutions $U_{ik}$, $V_{ik}$ of equations 
(4:13) whose determinant $|U_{ik}(x)|$ is different from zero on $x_1x_2$ as stated 
in Theorem 8:1 below. To do this we consider a normal extremal arc $E_{12}$ that 
is normal on every sub-interval $x_1x_3$ and satisfies the conditions III', IV' 
just described. From the condition III' we conclude that $E_{12}$ is non-singular 
(IX, p. 735). 


Lemma 8:1. There is an interval $x_1 < x < x_1 + h$ on which there is no point 3 
conjugate to 1 on $E_{12}$. 


This lemma is readily proved by the methods used by Bliss to establish 
the corresponding theorem for the problem of Lagrange (IX, pp. 737-740). 
Bliss makes the stronger assumption that $E_{12}$ is normal on every sub-interval 
$x'x''$, a restriction which is useful if we wish to show that there are no pairs 
of conjugate points whatsoever on $E_{12}$ defined by values $x'x''$ on an interval 
$x_1 \leq x < x_1 + h$. It can, however, be replaced by the weaker hypothesis that $E_{12}$ 
is normal on every sub-interval $x_1x_3$ if we wish to consider only the points 3 
conjugate to 1 on $E_{12}$. 

For every pair of solutions $(\eta_i, \zeta_i)$, $(u_i, v_i)$ of equations (4:13) it is known 
that the expression $\eta_iv_i - u_i\zeta_i$ is a constant. If this constant is zero, then the 
two solutions are called conjugate solutions of equations (4:13). A set of $n$ 
mutually conjugate solutions of equations (4:13) is said to form a conjugate 
system of solutions. 
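
The constancy of this expression can be checked directly when the system is in canonical form. The following is a sketch, assuming that equations (4:13) can be written in the matrix form below with symmetric matrices $B$, $C$; the matrices $A$, $B$, $C$ are notation introduced for this illustration and do not appear in the paper.

```latex
% Assumed canonical form of the linear system; (u, v) satisfies the same equations
\[
\eta' = A\eta + B\zeta, \qquad \zeta' = C\eta - A^{T}\zeta, \qquad
B^{T} = B, \quad C^{T} = C,
\]
\[
\frac{d}{dx}\bigl(\eta^{T}v - u^{T}\zeta\bigr)
= (A\eta + B\zeta)^{T}v + \eta^{T}(Cu - A^{T}v)
- (Au + Bv)^{T}\zeta - u^{T}(C\eta - A^{T}\zeta)
\]
\[
= \zeta^{T}Bv - v^{T}B\zeta + \eta^{T}Cu - u^{T}C\eta = 0 .
\]
```

The terms in $A$ cancel in pairs, and the remaining terms vanish by the symmetry of $B$ and $C$.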

Consider now the system of solutions $u_{ik}$, $v_{ik}$ of equations (4:13) defined in 
Theorem 6:2. One readily verifies that this system forms a conjugate system 
if the functions $v_{ik}(x)$ are modified so that they satisfy the relation $z_i(x_1)v_{ik}(x_1) 
= 0$. This can be done by adding to the solutions $u_{ik}$, $v_{ik}$ suitable 
multiples of the solution $\eta_i = 0$, $\zeta_i = z_i(x)$. Furthermore, since $E_{12}$ satisfies the 
condition IV' it follows from Theorem 6:2 and Lemma 8:1 that the 
determinant $|u_{ik}(x)|$ is different from zero on the interval $x_1 < x \leq x_2$. When the 
matrices $\|u_{ik}\|$, $\|v_{ik}\|$ are multiplied on the right by the inverse of the matrix 
$\|u_{ik}(x_2)\|$ a new conjugate system $\eta_{ik}$, $\zeta_{ik}$ is formed which takes values $\delta_{ik}$, 
$B_{ik}$ at $x = x_2$, where $\delta_{ik}$ equals 0 or 1 according as $i \neq k$ or $i = k$, and $B_{ik} = B_{ki}$. It 
is clear that the determinant $|\eta_{ik}(x)|$ is also different from zero on the 
interval $x_1 < x \leq x_2$. Hence the n-parameter family of solutions of equations (4:13) 


(8:1) $\eta_i = \eta_{ik}(x)a_k, \qquad \zeta_i = \zeta_{ik}(x)a_k \qquad (x_1 < x \leq x_2)$ 


simply covers a region $\mathfrak{F}$ of points $(x, \eta_1, \cdots, \eta_n)$ whose x-coordinates lie on 
the interval $x_1 < x \leq x_2$. Each arc of this family intersects the hyperplane 
$x = x_2$ in points whose $\eta$-coordinates are the parameters $a_i$ defining the arc. 
Furthermore, on the hyperplane $x = x_2$ the Hilbert integral $I_2^*$ for the $x\eta$- 
problem defined by the family (8:1) takes the form 


and hence is independent of the path. It follows that the family (8:1) defines 
a field § (IX, p. 733), and the following lemma is established: 


Lemma 8:2. If $\eta_{ik}$, $\zeta_{ik}$ is a conjugate system of solutions taking at $x = x_2$ the 
values $\delta_{ik}$, $B_{ik}$ just defined, then the determinant $|\eta_{ik}(x)|$ is different from zero on 
the interval $x_1 < x \leq x_2$. Furthermore the n-parameter family (8:1) of solutions of 
the accessory equations defines a Mayer field over a region $\mathfrak{F}$ of points $(x, \eta_1, 
\cdots, \eta_n)$ whose x-coordinates lie on the interval $x_1 < x \leq x_2$. 


Lemma 8:3. For every extremal Tx for the xn-problem joining points (x, n) 




= (x3, 0) and (x, n) = (x2, a), with x; the relation 
(8:2) — 2 0 
holds, where 

n, n’)dx. 


Consider first the case when $x_3 > x_1$. According to Lemma 8:2 the Hilbert 
integral $I_2^*$ for the integral $I_2$ is independent of the path in $\mathfrak{F}$. Hence 


(8:3) = + 


= Bixaiar. 
Ly 


Since I’, is admissible it follows that 


(8:4) To(T 34) — 34) = Eodx, 


where Ey is the Weierstrass E-function formed for the function 22. By the 
use of Taylor’s expansion one readily verifies that the condition III’ on E,, 
implies that Eg20 along I'y4. Hence from equations (8:3) and (8:4) it is 
clear that the inequality (8:2) is true whenever x3>,. If now x;=, then 
I'y is an extremal of the field and by direct integration it is found that 
34) = Hence the lemma is established. 

The following theorem gives us the result described at the beginning of 
this section. 




THEOREM 8:1. Let $U_{ik}$, $V_{ik}$ be a conjugate system of solutions of equations 
(4:13) having at $x = x_2$ the initial values $\delta_{ik}$, $H_{ik} = B_{ik} - \delta_{ik}$, where $\delta_{ik}$, $B_{ik}$ are 
the values described above. For such a system the determinant $|U_{ik}(x)|$ is 
different from zero on the whole interval $x_1 \leq x \leq x_2$ and $H_{ik} = H_{ki}$. 


In the first place $|U_{ik}(x_2)| = 1$. If now $|U_{ik}(x)|$ vanishes for a value 
$x_3$ $(x_1 \leq x_3 < x_2)$ then there exist constants $a_k$, not all zero, such that $U_{ik}(x_3)a_k 
= 0$. The equations 
$\eta_i = U_{ik}(x)a_k, \qquad \zeta_i = V_{ik}(x)a_k \qquad (x_3 \leq x \leq x_2)$ 
define an arc $\Gamma_{34}$ as in Lemma 8:3. By direct integration it is found that for 
this arc 


$I_2(\Gamma_{34}) - B_{ik}a_ia_k = (B_{ik} - \delta_{ik})a_ia_k - B_{ik}a_ia_k = -a_ia_i < 0.$ 
This contradicts the result obtained in Lemma 8:3. Hence $|U_{ik}(x)|$ is 
different from zero on the whole interval $x_1x_2$ as was to be proved. 


9. The construction of a field. In order to construct a field we need the 
following theorem: 


THEOREM 9:1. Suppose that an n-parameter family of extremals 
(9:1) $y_i = y_i(x, a_1, \cdots, a_n), \qquad \lambda_\alpha = \lambda_\alpha(x, a_1, \cdots, a_n)$ 
is intersected by an n-dimensional manifold 


and simply covers a region $\mathfrak{F}$ of xy-space containing only interior points. If the 
parameter values of the extremal through the point $(x, y)$ are denoted by $a_i(x, y)$, 
then the region $\mathfrak{F}$ is a field with slope-functions and multipliers 


(9:3) $p_i(x, y) = y_{ix}[x, a(x, y)], \qquad \lambda_\alpha(x, y) = \lambda_\alpha[x, a(x, y)]$ 
provided that the integral $I^*$ is independent of the path on the n-dimensional 
manifold (9:2). 

This theorem has been established by Bliss for the problem of Lagrange 
(IX, p. 733). The proof is the same for the problem considered here. 


THEOREM 9:2. If a normal extremal arc $E_{12}$ is normal on every sub-interval 
$x_1x_3$ and satisfies the conditions III', IV', then $E_{12}$ is a member of an n-parameter 
family of extremals (9:1) whose determinant $|y_{ia_k}|$ is different from zero along 
$E_{12}$. Furthermore $E_{12}$ is an extremal arc of a field $\mathfrak{F}$ simply covered by the family. 


To prove this let $W(a_1, \cdots, a_n)$ be a function of the form 
(9:4) $W(a) = z_{i2}a_i + (1/2)H_{ik}(a_i - y_{i2})(a_k - y_{k2}),$ 




where the constants $y_{i2}$, $z_{i2}$ are the values of the functions $y_i(x)$, $z_i(x)$ defining 
$E_{12}$ at $x = x_2$, and the $H_{ik}$ are the numbers belonging to the conjugate system 
$U_{ik}$, $V_{ik}$ defined in Theorem 8:1. When in equations (3:6) the set $(x_0, y_{i0}, 
z_{i0})$ is replaced by the set $(x_2, a_i, W_{a_i})$, an n-parameter family of extremals 


(9:5) $y_i = y_i(x, x_2, a, W_a) = y_i(x, a),$ 
      $z_i = z_i(x, x_2, a, W_a) = z_i(x, a)$ 


is defined and contains $E_{12}$ for the special values $a_i = y_{i2}$. The multipliers 
$\lambda_\alpha(x, a)$ associated with this family are determined by equations (3:3). 
Furthermore, since each extremal (9:5) defined by parameter values $a_i$ has 
on it the element $(x_2, a_i, W_{a_i})$, it follows that $y_{ia_k} = \delta_{ik}$, $z_{ia_k} = W_{a_ia_k} = H_{ik}$ at 
$x = x_2$. Hence from Theorem 8:1 we conclude that the determinant $|y_{ia_k}|$ is 
different from zero along each extremal of the family (9:5). This family, 
therefore, simply covers a neighborhood $\mathfrak{F}$ of $E_{12}$. Moreover, on the 
hyperplane $x = x_2$ the Hilbert integral $I^*$ can be expressed in the form 


$I^* = \int F_{y_i'}\,dy_i = \int W_{a_i}\,da_i = \int dW$ 


and hence is independent of the path. Theorem 9:1 now justifies the theorem 
that was to be proved. 
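
The properties of $W$ used in this proof can be verified by direct differentiation. The following short check takes $W$ in the form $W(a) = z_{i2}a_i + \tfrac12 H_{ik}(a_i - y_{i2})(a_k - y_{k2})$ and uses the symmetry $H_{ik} = H_{ki}$ stated in Theorem 8:1; summation over repeated indices is understood.

```latex
% Differentiate W(a) = z_{i2} a_i + (1/2) H_{ik} (a_i - y_{i2})(a_k - y_{k2}), H_{ik} = H_{ki}
\[
W_{a_i} = z_{i2} + H_{ik}(a_k - y_{k2}), \qquad W_{a_i a_k} = H_{ik},
\]
\[
W_{a_i}\Big|_{a = y_2} = z_{i2}.
\]
```

Hence for $a_i = y_{i2}$ the initial element $(x_2, a_i, W_{a_i})$ reduces to $(x_2, y_{i2}, z_{i2})$, the element of $E_{12}$ at $x = x_2$, and the second derivatives of $W$ reproduce the constants $H_{ik}$.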


THEOREM 9:3. Let a normal extremal arc $E_{12}$ be a member of an n-parameter 
family of extremals (9:1) whose determinant $|y_{ia_k}|$ is different from zero along 
$E_{12}$. If the ends of $E_{12}$ satisfy the conditions $\psi_\mu = 0$, then there is a neighborhood N 
of these ends in $(x_1y_1x_2y_2)$-space such that $E_{12}$ is the only extremal of the family 
with ends in N satisfying the conditions $\psi_\mu = 0$. 


To prove this let $E_{12}$ be a member of the family (9:1) for the special 
parameter values $(x_{10}, x_{20}, a_0)$. By hypothesis these values satisfy the equations 


(m1, %2, = Vp a), %2; a)] = 0. 


The theorem now follows at once from implicit function theorems if we can 
show that the matrix 


(9:6) + Vows Woz, + Vovis + VouisVio, (%2)|| 


has rank $n+2$ on $E_{12}$. To do this suppose that it had rank less than $n+2$. Then 
there would exist constants $b_1$, $b_2$, $c_k$, not all zero, such that the relations 


+ Yat + (Wor, + Vit + + = 0, 
(%1) Via, — Fy (%2) = O 
would hold on Ey. The last equation is precisely the relation (2:3) for the 




admissible variations 7; = On account of the normality of Ei: the deter- 
minant (2:5) is different from zero on Ey. Hence we would have 


bi = be = Yia,(X10, Go)Ck = Yia,(X20, = 0. 


But this is impossible since the determinant $|y_{ia_k}|$ is different from zero along 
$E_{12}$. The matrix (9:6) therefore has rank $n+2$ on $E_{12}$, and the theorem is 
established. 

10. Sufficient conditions for relative minima. The condition I is defined in 
§2, the Clebsch condition III' and the Mayer condition IV' in §8. A normal 
minimizing arc $E_{12}$ is said to satisfy the Weierstrass condition II$_\mathfrak{N}$' if at each 
element $(x, y, y', \lambda)$ in a neighborhood $\mathfrak{N}$ of those belonging to $E_{12}$ the 
inequality 

$E(x, y, y', \lambda, Y') > 0$ 


holds for every admissible element $(x, y, Y') \neq (x, y, y')$. 


THEOREM 10:1. SUFFICIENT CONDITIONS FOR A STRONG RELATIVE MINIMUM. 
Let $E_{12}$ be an admissible arc without corners and with ends satisfying the 
conditions $\psi_\mu = 0$. If $E_{12}$ is normal relative to the end conditions $\psi_\mu = 0$, is normal 
on every sub-interval $x_1x_3$ of $x_1x_2$, and satisfies the conditions I, II$_\mathfrak{N}$', III', IV', 
then there are neighborhoods $\mathfrak{F}$ of $E_{12}$ in xy-space and N of the ends of $E_{12}$ in 
$(x_1y_1x_2y_2)$-space such that the inequality $g(C_{34}) > g(E_{12})$ holds for every admissible 
arc $C_{34}$ in $\mathfrak{F}$ with ends in N satisfying the conditions $\psi_\mu = 0$ and not identical with 
$E_{12}$. 


To prove this theorem we first notice that the condition I and the 
normality of $E_{12}$ imply a unique set of multipliers $\lambda_\alpha(x)$ and constants $c_i$ with 
which $E_{12}$ satisfies equations (2:2) and for which $l_0 = 1$, as agreed upon in 
Theorem 2:2. The condition III' implies further that $E_{12}$ is non-singular and 
hence must be a single extremal arc, since it has no corners (IX, p. 735). 
According to Theorem 9:2 we now see that $E_{12}$ is an extremal of a field $\mathfrak{F}$ 
with slope functions and multipliers $p_i(x, y)$, $\lambda_\alpha(x, y)$. It follows that if the 
field $\mathfrak{F}$ is taken sufficiently small, the values $x$, $y$, $p_i(x, y)$, $\lambda_\alpha(x, y)$ belonging 
to it will lie in so small a neighborhood of the sets $(x, y, y', \lambda)$ belonging to 
$E_{12}$ that the condition II$_\mathfrak{N}$' will imply the inequality 


$E[x, y, p(x, y), \lambda(x, y), y'] > 0$ 


for every admissible set $(x, y, y') \neq (x, y, p)$ in $\mathfrak{F}$. Theorem 9:3 and the 
fundamental sufficiency theorem 7:2 now justify the theorem that was to 
be proved. 




Bliss (IX, pp. 736-37) has shown that if an extremal arc $E_{12}$ satisfies the 
condition III' and is an extremal of a field $\mathfrak{F}$ with slope functions and 
multipliers $p_i(x, y)$, $\lambda_\alpha(x, y)$, then the inequality 


$E[x, y, p(x, y), \lambda(x, y), y'] > 0$ 


holds for every admissible set $(x, y, y') \neq (x, y, p)$ in a neighborhood $\mathfrak{N}$ of the 
sets $(x, y, y')$ on $E_{12}$. Hence by arguments like those in the preceding paragraph 
the following theorem is justified: 


THEOREM 10:2. SUFFICIENT CONDITIONS FOR A WEAK RELATIVE MINIMUM. 
If an admissible arc $E_{12}$ satisfies all the conditions of the preceding theorem except 
the condition II$_\mathfrak{N}$', then there are neighborhoods $\mathfrak{N}$ of the sets $(x, y, y')$ on $E_{12}$ and 
N of the end values $(x_1, y_1, x_2, y_2)$ of $E_{12}$ such that the inequality $g(C_{34}) > g(E_{12})$ 
is true for every admissible arc $C_{34}$ whose elements $(x, y, y')$ are all in $\mathfrak{N}$, whose 
ends are in N and satisfy the conditions $\psi_\mu = 0$, and which is not identical with 
$E_{12}$. 


Suppose now that the functions $\psi_\mu$ are continuous at every pair of 
distinct or coincident points in a neighborhood of those belonging to $E_{12}$. 
Bliss has shown that if the ends of $E_{12}$ are the only pair of distinct or 
coincident points on $E_{12}$ satisfying the conditions $\psi_\mu = 0$, then for every 
neighborhood N of the ends of $E_{12}$ in $(x_1y_1x_2y_2)$-space there is a neighborhood $\mathfrak{F}$ of 
$E_{12}$ in xy-space such that every pair of points $(x_1, y_1)$, $(x_2, y_2)$ in $\mathfrak{F}$ satisfying 
the conditions $\psi_\mu = 0$ are also in N (XII, p. 267). Hence by suitably 
restricting the neighborhood $\mathfrak{F}$ of $E_{12}$ in Theorem 10:1 we have the following 
corollary: 


COROLLARY 10:1. Let $E_{12}$ be an admissible arc satisfying the conditions 
described in Theorem 10:1. If further the ends of $E_{12}$ are the only pair of distinct 
or coincident points on $E_{12}$ satisfying the conditions $\psi_\mu = 0$, then there is a 
neighborhood $\mathfrak{F}$ of $E_{12}$ in xy-space such that the inequality $g(C_{34}) > g(E_{12})$ holds for 
every admissible arc $C_{34}$ in $\mathfrak{F}$ with ends satisfying the conditions $\psi_\mu = 0$ and not 
identical with $E_{12}$. 


A similar corollary can be stated for weak relative minima. 


BIBLIOGRAPHY 


I. Kneser, Lehrbuch der Variationsrechnung, Braunschweig, 1900, pp. 227-261. 

II. Egorov, Die hinreichenden Bedingungen des Extremums in der Theorie des Mayerschen Prob- 
lems, Mathematische Annalen, vol. 62 (1906), pp. 371-380. 

III. Bolza, Über den anormalen Fall beim Lagrangeschen und Mayerschen Problem mit gemischten 
Bedingungen und variablen Endpunkten, Mathematische Annalen, vol. 74 (1913), pp. 430-446. 

IV. Goursat, A Course in Mathematical Analysis, translated by Hedrick and Dunkel, vol. 2, 
Part 2. 




V. Bliss, The problem of Mayer with variable end points, these Transactions, vol. 19 (1918), 
pp. 305-314. 

VI. Larew, Necessary conditions in the problem of Mayer in the calculus of variations, these Trans- 
actions, vol. 20 (1919), pp. 1-22. 

VII. Larew, The Hilbert integral and Mayer fields for the problem of Mayer in the calculus of 
variations, these Transactions, vol. 26 (1924), pp. 61-67. 

VIII. Kneser, Lehrbuch der Variationsrechnung, 2d edition, Braunschweig, 1925, pp. 240-304. 

IX. Bliss, The problem of Lagrange in the calculus of variations, American Journal of Mathe- 
matics, vol. 52 (1930), pp. 673-742. 

X. Morse and Myers, The problems of Lagrange and Mayer with variable end points, Proceedings 
of the American Academy of Arts and Sciences, vol. 66 (1931), pp. 235-253. 

XI. Morse, Sufficient conditions in the problem of Lagrange with variable end conditions, American 
Journal of Mathematics, vol. 53 (1931), pp. 517-546. 

XII. Bliss, The problem of Bolza in the calculus of variations, Annals of Mathematics, vol. 33 
(1932), pp. 261-274. 


UNIVERSITY OF CHICAGO, 
CHICAGO, ILL. 




THE TOTAL VARIATION OF $g(x+h) - g(x)$ 


BY 
NORBERT WIENER and R. C. YOUNG* 


1. There is a fundamental theorem in the theory of Lebesgue integration 
that if $f(x)$ be any function integrable in the sense of Lebesgue over the interval $(a, b)$, then 
the integral 


(1) $\displaystyle\int_a^b |f(x + h) - f(x)|\,dx$ 


tends to zero with $h$. This theorem is usually proved by approximating to 
$f(x)$ in terms of continuous functions, for which the property is obvious. 

The integral (1) is the total variation in $(a, b)$ of the function $F(x+h) - 
F(x)$, where 


$F(x) = \displaystyle\int^x f(x)\,dx,$ 


and the quoted property is equivalent to the following statement: 

(I) If $F(x)$ be any absolutely continuous function in $(a, b)$, the total variation of 
$F(x + h) - F(x)$ 

tends to zero with $h$. 
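
A worked instance of statement (I) may help fix ideas. The choice $F(x) = x^2$ is an example made here, not one taken from the paper.

```latex
% Example: F(x) = x^2, absolutely continuous with derivative f(x) = 2x
\[
F(x) = x^{2}, \qquad F(x+h) - F(x) = 2hx + h^{2},
\]
\[
\int_a^b \bigl|\,d[F(x+h) - F(x)]\,\bigr|
= \int_a^b |f(x+h) - f(x)|\,dx
= \int_a^b |2h|\,dx
= 2|h|(b - a) \longrightarrow 0 \quad (h \to 0).
\]
```

The difference is linear in $x$, so its total variation is its slope $2|h|$ times the length of the interval.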


This result no longer holds if we substitute for $F(x)$ any non-absolutely 
continuous function $G(x)$ of bounded variation. Indeed, it provides a 
necessary and sufficient condition for absolute continuity.† This fact may be first 
rendered plausible by taking the simplest case of a discontinuous function of 
bounded variation, and bearing in mind that a general continuous function 
of bounded variation is always a limit of simple discontinuous ones. If we 
assume for instance 

$g(x) = \alpha$ for $x \leq c$, 


$g(x) = \beta$ for $c < x$, 


* Presented to the Society, October 29, 1932; received by the editors April 21, 1932. 

† This result was proved, by an entirely different method, by A. Plessner, Journal für 
Mathematik, vol. 160 (1929), pp. 26-32. A study of the total variation of $g(x+h) - g(x)$ when $g(x)$ is 
constant in the complementary intervals of Cantor's set is contained in an article by Hille and Tamarkin, 
American Mathematical Monthly, vol. 36 (1929), pp. 255-264. The existence of these two papers 
was called to our attention after the present paper had been completed. We notice also an erroneous 
statement in a paper by J. M. Whittaker, Proceedings of the Edinburgh Mathematical Society, (2), 
vol. 1 (1929), p. 232 (Lemma 1). 


we have, for $h > 0$, 
$g(x + h) - g(x) = 0$ for $x \leq c - h$ and $x > c$, 
$g(x + h) - g(x) = \beta - \alpha$ for $c - h < x \leq c$, 


and over any interval $(a, b)$ containing $c$ and $c - h$ internally, the total 
variation of $g(x+h) - g(x)$ is the sum of the absolute values of its jumps, viz. 
$2|\beta - \alpha|$. 


In this special case we may then write, for every $h$, and every interval $(a, b)$, 
$a \neq c$, $b \neq c$, 

(2) $\displaystyle\int_a^b |d[g(x+h) - g(x)]| = \int_a^b |dg(x+h)| + \int_a^b |dg(x)|.$ 

Actually this is the relation which we shall show (§3) to hold for any singular 
function $g(x)$ of bounded variation, if not for all $h$, at any rate for almost all. 
By a singular function we mean a function of bounded variation whose 
derivative is zero almost everywhere, of which the above discontinuous g(x) 
provides a very special example. 

An arbitrary function of bounded variation is the sum of an absolutely 
continuous function and a singular function of bounded variation, uniquely 
defined and called the singular part of the original function. Thus the proof 
of the above statement will carry with it, as an immediate consequence of I, 
the following result: 


(II) If $g(x)$ be any function of bounded variation with singular part $\gamma(x)$ 
continuous at the end points $a$, $b$,* we have 


(3) $\displaystyle\varlimsup_{h \to 0} \int_a^b |d[g(x+h) - g(x)]| = \varlimsup_{h \to 0} \int_a^b |d[\gamma(x+h) - \gamma(x)]| = 2\int_a^b |d\gamma(x)|.$ 


This statement is of course only a rough corollary of the result for 
singular functions and absolutely continuous functions. More precise consequences 
to be borne in mind are 


(i) the left-hand equality in (3) holds not merely for $h$ tending to 0 
continuously, but for $h$ tending to 0 through any discontinuous sequence; 

* As an unessential restriction reducing to 




(ii) the right-hand equality in (3) holds similarly for discontinuous approach 
of h to 0, if a certain set of measure 0 be avoided by h. To obtain any inter- 
mediary or the lower value of the considered limits, we have to take sub- 
sequences of this set of measure 0. 

Examples will be constructed (§§4-6) to show what various possibilities 
exist with regard to the exceptional set of values of $h$. It will be seen that this 
set may be more than countable, or countable, or entirely absent. In 
particular therefore the limits in (3) may be unique limits. On the other hand, 
the examples will also show that the lower limits corresponding to the upper 
limits in (3) may in other cases have any lesser non-negative value. The case 
in which the lower limit is 0 would seem to have particular interest, and it 
might be useful to examine in greater detail the particular sequences of $h_n$ 
tending to 0, for which this limit is obtained. It seems a question for instance 
whether a non-absolutely continuous function $g(x)$ exists for which $h_n = 1/n$ 
would provide a sequence of this kind, or more generally, any other special 
sequence for which $h_n/h_{n+1}$ is bounded. 

2. We use the following simple lemmas. 


Lemma 1. Given two functions $g_1(x)$, $g_2(x)$ of bounded variation, of which 
$g_1(x)$ has derivative zero, $g_1'(x) = 0$, on the set complementary to a given set $H_1$ 
in $(a, b)$. Then we have 


(4) $\displaystyle\int_a^b |d[g_2 - g_1]| \geq \int_a^b |dg_1| + \int_a^b |dg_2| - 2\int_{H_1} |dg_2|.$ 


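We have 


$\displaystyle\int_{CH_1} |dg_2| - \int_{CH_1} |dg_1| \leq \int_{CH_1} |d[g_2 - g_1]| \leq \int_{CH_1} |dg_2| + \int_{CH_1} |dg_1|,$ 


and since 

$\displaystyle\int_{CH_1} |dg_1| = 0,$ 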

* This lemma has been stated initially in a less general form. The authors are indebted to Dr. 
S. Saks for the suggestion of extending their proof to the present, essentially more general, situation. 
We refer to de la Vallée Poussin, Intégrales de Lebesgue, Fonctions d’Ensemble, Classes de Baire, 
Paris, 1916, pp. 90-95, as to the notion of the total variation over a set, and as to some formulas 
which we are using here. 


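we have 


$\displaystyle\int_{CH_1} |d[g_2 - g_1]| = \int_{CH_1} |dg_2|.$ 


On the other hand, over the set $H_1$ itself, 


$\displaystyle\int_{H_1} |d[g_2 - g_1]| \geq \int_{H_1} |dg_1| - \int_{H_1} |dg_2|.$ 


Hence, by addition, 


$\displaystyle\int_a^b |d[g_2 - g_1]| \geq \int_{H_1} |dg_1| - \int_{H_1} |dg_2| + \int_{CH_1} |dg_2| = \int_a^b |dg_1| + \int_a^b |dg_2| - 2\int_{H_1} |dg_2|,$ 


giving (4) as required. 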


Lemma 2. Let $E_h$ be a variable set depending on a parameter $h$, and such that 
each point $x$ belongs to $E_h$ at most for a set of measure 0 of values of $h$*. Then for 
each function $g(x)$ of bounded variation, 


$\displaystyle\int_{E_h} |dg| = 0$ 


for almost all values of $h$. (More precisely, the exceptional values of $h$ form a set 
of measure 0 depending on $g$.) 


This is immediate from the theory of change of order of integration in a 
repeated Stieltjes integral.† Let $G(x)$ be the indefinite total variation of $g(x)$: 


$G(x) = \displaystyle\int_a^x |dg|$ 


and $E(x, h)$ the characteristic function of $E_h$, equal to 1 in $E_h$ and 0 
elsewhere, for each value of $h$. We have then 


$V(h) = \displaystyle\int_{E_h} |dg| = \int_a^b E(x, h)\,dG(x).$ 


Integrating this with respect to $h$, we get 


$\displaystyle\int V(h)\,dh = \int dh \int_a^b E(x, h)\,dG(x) = \int_a^b dG(x) \int E(x, h)\,dh,$ 


* In other words, $E_h$ is the section $y = $ constant $h$, of a plane set $E$ whose sections $x = $ constant 
all have measure 0. Such a set $E$ has plane measure 0, and it is well known that $E_h$ must then have 
measure 0 for almost all $h$. The present Lemma states that $E_h$ has also measure 0 with respect to 
any function of bounded variation, for almost all $h$, and is an immediate adaptation of the classical 
result. 

† See, e.g., L. C. Young, The Theory of Integration, Cambridge Tracts, No. 21, p. 41 (Theorem 
IV). 




where, by the hypothesis, 


\[ \int E(x, h)\,dh = 0 \quad \text{for each } x, \]

as the measure of the set of values of h for which x belongs to $E_h$. Thus

\[ \int V(h)\,dh = 0, \]


and hence the non-negative function V(h) vanishes except at most in a set of 
measure 0. 

In our application of this Lemma, the set $E_h$ will be the h-translation of a
fixed set $E_0$ of measure 0, that is, the set of points x such that x−h belongs
to $E_0$. It has the characteristic function


= +h). 
For fixed x this still represents a set of measure 0 in h, and so the hypothesis 
of the Lemma is fulfilled. Thus we have 


COROLLARY. Given any function of bounded variation g(x), and any set $E_0$
of measure 0, the total variation of g over the h-translation of $E_0$ vanishes except
at most for a set of values of h of measure 0.

3. We have now immediately our 


THEOREM. If g(x) be a function of bounded variation constant in the complementary
intervals of a closed set H of measure 0, we have, for almost all h,

(5) \[ \int_a^b |d[g(x+h) - g(x)]| = \int_a^b |dg(x+h)| + \int_a^b |dg(x)|. \]


For, g(x+h) is then constant in the complementary intervals of the
h-translation $H_h$ of H, so by Lemma 1, for

\[ g_2(x) = g(x + h), \quad g_1(x) = g(x), \]

the two sides of (5) differ by at most

\[ 2\int_H |dg(x+h)| \]

and by Lemma 2, in the Corollary form, this vanishes for almost all h.

In constructing examples, it is simplest to consider periodic functions with 
interval of periodicity (a, b), and assume for instance a=0, b=1. Then (5) 
takes the form 


(5') \[ \int_0^1 |d[g(x+h) - g(x)]| = 2\int_0^1 |dg(x)|. \]


In that case also it suffices to consider only positive values of h, since
g(x+h)−g(x) is then still periodic with the same period as g, and so

\[ \int_0^1 |d[g(x-h) - g(x)]| = \int_0^1 |d[g(x+h) - g(x)]|. \]

Furthermore we need only consider monotonic functions, with $\int_0^1 dg = 1$.

4. The classical example of a singular function of bounded variation is
that of the monotone function constant in the complementary intervals of
Cantor’s typical ternary set* and representing an even mass distribution over
this set, in the interval (0, 1). If x be expressed as a ternary fraction

(6) \[ x = \sum_{i=1}^{\infty} a_i 3^{-i}, \qquad a_i = 0, 1 \text{ or } 2, \]

and $a_n$ be the first (if any) of its digits equal to 1, the function is defined at x
to have the value

(7) \[ g(x) = \tfrac{1}{2} \sum_{i=1}^{n-1} a_i 2^{-i} + a_n 2^{-n}, \]

or if there be no digit 1 in (6), then

(7’) \[ g(x) = \tfrac{1}{2} \sum_{i=1}^{\infty} a_i 2^{-i}. \]
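As an illustration only, the digit rule of (7), (7’) may be sketched in code. The function name `cantor_function` and the truncation parameter `max_digits` are assumptions of this sketch and not part of the paper; exact rational arithmetic is used so that terminating ternary fractions are evaluated without rounding.

```python
from fractions import Fraction

def cantor_function(x, max_digits=60):
    """Evaluate Cantor's function on [0, 1] from the ternary digits of x."""
    x = Fraction(x)
    if x <= 0:
        return Fraction(0)
    if x >= 1:
        return Fraction(1)
    value = Fraction(0)
    weight = Fraction(1, 2)                    # weight 2^{-i} attached to digit i
    for _ in range(max_digits):
        x *= 3
        digit = x.numerator // x.denominator   # next ternary digit a_i
        x -= digit
        if digit == 1:                         # first odd digit: rule (7) stops here
            return value + weight
        value += weight * (digit // 2)         # even digit contributes (a_i / 2) 2^{-i}
        weight /= 2
    return value                               # truncated form of rule (7')
```

For a non-terminating fraction without a digit 1 the loop stops after `max_digits` places, so the result is then only an approximation from below.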


This function is a particular example of a class of functions possessing the
following property; this is most easily described by introducing the expression
“ternary interval of order n” to designate specifically the open intervals of
length $1/3^n$ whose left-hand end points are the terminating ternary fractions
of at most n digits, all even:

\[ \cdot b_1 b_2 \cdots b_n, \qquad b_i = 0 \text{ or } 2. \]

Each such interval is contained in exactly one of lower order, and contains
exactly $2^{r-n}$ intervals of higher order r>n. The intervals of given order are
of course mutually exclusive and non-abutting.
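The counting statements about ternary intervals can be checked on small orders. The following is a minimal sketch; the helper names `ternary_intervals` and `count_contained` are hypothetical, and end points are kept as exact fractions.

```python
from fractions import Fraction
from itertools import product

def ternary_intervals(n):
    """Return the ternary intervals of order n as (left, right) pairs, sorted."""
    length = Fraction(1, 3 ** n)
    intervals = []
    for digits in product((0, 2), repeat=n):   # left end points have even digits only
        left = sum(Fraction(d, 3 ** (i + 1)) for i, d in enumerate(digits))
        intervals.append((left, left + length))
    return sorted(intervals)

def count_contained(outer, order):
    """Count the ternary intervals of the given order lying inside `outer`."""
    lo, hi = outer
    return sum(1 for a, b in ternary_intervals(order) if lo <= a and b <= hi)
```

With these helpers, an interval of order 2 is found to contain four intervals of order 4, and consecutive intervals of one order are separated by a gap.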


Property (A). For each ternary interval (α, β) of order n, and each ternary
interval (α′, β′) of order (n+l) contained in it,

\[ \frac{g(\beta') - g(\alpha')}{g(\beta) - g(\alpha)} \le \delta, \]

where l is a positive integer and δ a constant less than 1, independent of the particular
interval.


* The set of all non-terminating ternary fractions with even digits only, together with the ter- 
minating fractions whose last digit at most is odd. 




Such functions share with the ordinary measuring function m(x) = x a
property relative to fractions with prescribed digits which we shall use in the
form of


Lemma 3. If g(x) be any function, constant in the complementary intervals
of Cantor’s ternary set $H_0$, and possessing the property (A), and if E be a subset
of $H_0$ in which (in the ternary scale)

(6) \[ x = \cdot a_1 a_2 \cdots a_i \cdots \]

has fixed prescribed digits (0 or 2) for an infinity of indices,

\[ n_1 < n_2 < \cdots < n_k < \cdots \]

(the other digits being arbitrary, 0 or 2), then

\[ \int_E |dg| = 0. \]
E 


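Consider the ternary intervals of order $n_k$ and left-hand end points

\[ \cdot \beta_1 \beta_2 \cdots \beta_{n_1 - 1}\, a_{n_1} \cdots a_{n_k}, \quad a_{n_i} \text{ as above}, \quad \beta_i = 0 \text{ or } 2, \]

and let $\sigma_k$ denote their sum-set. Then $E \subset \sigma_k$ for each k (actually $E = \lim_{k \to \infty} \sigma_k$).
Also

(8) \[ \int_{\sigma_k} |dg| \le \delta \int_{\sigma_{k-l}} |dg|. \]

This is because each interval of $\sigma_k$ is in a different ternary interval of order
$n_k - l$ contained in $\sigma_{k-l}$, and by property (A) contributes not more than δ
times the variation over this interval to the total variation over $\sigma_k$. Thus if
[k/l] be the integral part of k/l,

\[ \int_E |dg| \le \int_{\sigma_k} |dg| \le \delta^{[k/l]} \int_0^1 |dg| \to 0 \quad \text{with } 1/k \]

since δ<1.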
This lemma is applied in conjunction with another relating purely to the
ternary set $H_0$:


Lemma 4. To each non-terminating ternary fraction $h = \cdot a_1 a_2 \cdots a_i \cdots$
there corresponds an infinite sequence of indices

\[ n_1 < n_2 < \cdots < n_k < \cdots \]

and two specifications for the whole sequence of digits $\{c_{n_k}\}$ (0 or 2), which a
ternary fraction $x = \cdot c_1 c_2 \cdots c_i \cdots$ must certainly satisfy if it is to belong to
both $H_0$ and its h-translation $H_h$.


This means that the common part $H_0 H_h$ is the sum of two sets of the kind
considered in Lemma 3. What we prove precisely is as follows:


(i) If h has an infinite number of odd digits

\[ a_{n_1} = a_{n_2} = \cdots = a_{n_k} = \cdots = 1, \]

then the corresponding digits of x must be alternately 0 and 2, i.e.

\[ c_{n_1} = c_{n_3} = \cdots = 0 \text{ or } 2, \qquad c_{n_2} = c_{n_4} = \cdots = 2 \text{ or } 0. \]

(ii) If h has only a finite number of odd digits,

\[ a_i \ne 1 \quad \text{for } i > N, \]

and $(r_i)$ are the indices for which $a_i = 0$, $(s_i)$ those for which $a_i = 2$, after the
Nth, then in x we must have either

\[ c_{r_1} = c_{r_2} = \cdots = c_{r_i} = \cdots = 0 \]

or

\[ c_{s_1} = c_{s_2} = \cdots = c_{s_i} = \cdots = 2. \]

We deduce this from the following obvious facts, true for any h, terminating
or not:
Given in the ternary scale

\[ h = \cdot a_1 a_2 \cdots a_i \cdots \quad (a_i = 0, 1 \text{ or } 2), \]
\[ x = \cdot b_1 b_2 \cdots b_i \cdots \quad (b_i = 0 \text{ or } 2), \]
\[ x + h = \cdot c_1 c_2 \cdots c_i \cdots \pmod 1 \quad (c_i = 0 \text{ or } 2), \]

we have

(a) if $a_{n_1} = a_{n_2} = 1$, $a_i \ne 1$ for $n_1 < i < n_2$,
then either
$b_{n_1} = 0$, $b_{n_2} = 2$ (and $c_{n_1} = 2$, $c_{n_2} = 0$),
or the same with $n_1$ and $n_2$ interchanged;

(b) if $a_r = b_r = 0\ (= c_r)$, $a_s = b_s = 2\ (= c_s)$,
then $a_i = 1$ for some index between r and s (implying |r−s| > 1).

In (a) we consider two consecutive odd digits in h, and affirm that the
two corresponding digits in x (and x+h) cannot be both 0 or both 2. In fact
$b_{n_2} = 2$ would imply that we had in the formal addition x+h to carry 1 from
the $n_2$th place right back to the $n_1$th, where it would compound to 2 with $a_{n_1}$,
and imply $b_{n_1} = 0$ if $c_{n_1}$ is to be even. And similarly $b_{n_2} = 0$ implies $b_{n_1} = 2$.

In (b) we consider two places in each of which h and x have the same


digits, but different in the two places, and we conclude that between those
two places h must have at least one digit = 1. We may assume that between
those two places no further coincidences occur, and we see at once that if the
formal addition x+h is to yield only even digits between those two places, and
h had none = 1 there, these digits in x+h would all be 0 with 1 to carry or all
2, with nothing to carry, and a 1 would appear in the sum in the earlier of the
two places r and s.

From Lemmas 3 and 4, we deduce that, for any function g(x) of period 1,
constant in the complementary intervals of Cantor’s set $H_0$ and possessing the
above property (A), we must have, whenever h is a non-terminating ternary fraction,

\[ \int_{H_0 H_h} |dg| = 0. \]

From this and Lemma 1, we deduce (as in the proof of our principal theorem),
that for all such functions g(x),

\[ \int_0^1 |d[g(x+h) - g(x)]| = 2\int_0^1 |dg(x)| \]

whenever h is a non-terminating ternary fraction.

The exceptional values of h thus belong to the countable set of terminating
ternary fractions.

5. In the case of Cantor’s function (7, 7’), property (A) holds with l = 1
and δ = 1/2, and equality sign. This provides therefore an instance of a function
g(x) for which the exceptional values of h are at most countable. It may be
seen moreover that in this case every terminating ternary fraction is an exceptional
value of h. For this it suffices to remark that (5) can certainly not
hold when g(x+h) − g(x) is constant in an interval in which neither g(x) nor
g(x+h) is constant. And if, when

\[ h = \cdot a_1 a_2 \cdots a_n, \]

we choose (as we can)

\[ x_0 = \cdot b_1 b_2 \cdots b_n, \quad b_i = 0 \text{ or } 2, \]

so that $x_0 + h$ has also only digits 0 or 2, then the ternary interval of order n
and left-hand end point $x_0$ satisfies this condition.* If we go into the question
more nearly, in order to investigate the lower limit of

\[ \int_0^1 |d[g(x + h) - g(x)]| \quad \text{as } h \to 0, \]


* Cf. also Hille and Tamarkin, loc. cit., p. 261. 


we find that for our present function, whereas

\[ \int_0^1 |d[g(x + h) - g(x)]| = 2 \]

for h non-terminating, we have for h terminating of exactly n digits,

(9) \[ 1 \le \int_0^1 |d[g(x + h) - g(x)]| \le 2 - 2^{1-n}, \]

with actual equality on either side for suitable values of h and each n. For let

\[ h = \cdot a_1 a_2 \cdots a_n, \quad a_i = 0, 1 \text{ or } 2, \quad a_n \ne 0, \]


and suppose to fix the ideas that $a_n = 2$. Then every ternary interval of order
n and left-hand end point

\[ \cdot b_1 b_2 \cdots b_n, \quad b_i = 0 \text{ or } 2, \]

with $b_n = 2$, translates through h into an interval in which g is constant, that
is to say is itself an interval in which g(x+h) is constant. Similarly if $a_n = 1$
and we take $b_n = 0$. So over half the ternary intervals of order n, the relation
(5) certainly holds. Similarly it holds for all the intervals of length $3^{-n}$ and
left-hand end points $\cdot b_1 b_2 \cdots b_n$ with some $b_i = 1$, $i \le n$, over which g(x) is constant.
There only remain the other half of the ternary intervals of order n, over
which g(x) has variation 1/2, and g(x+h) at most 1/2. So in this case the
right-hand side of (5’) cannot exceed the left by more than 1.

The other inferences follow also very simply. For instance 


Gn = = = = 1, = Oor2 forn— ky Ci<cn— kj, 
j =1,2,---,rt+1, Ro = 0, = 


according as 7 is odd or even, give some of the (everywhere dense set of) values 
of h for which equality occurs on the right of (9), while


\[ a_i = 2 \text{ for } i < n,\ a_n = 1 \qquad \text{or} \qquad a_i = 0 \text{ for } i < n,\ a_n = 2 \]


give the only values for which equality occurs on the left. 
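The behaviour of the total variation of g(x+h) − g(x) for terminating h can be explored numerically. The sketch below is an assumption-laden illustration, not the authors' procedure: it sums absolute increments over the grid of points $k/3^N$, which in general gives only a lower bound for the total variation; for terminating h whose digits all lie within the grid, one of g(x), g(x+h) is constant, or the two move together, on each grid cell, so the sum is then exact. The names `cantor`, `periodic_cantor` and `grid_variation` are hypothetical, and g is extended by g(x+1) = g(x) + 1 so that dg has period 1.

```python
from fractions import Fraction
from math import floor

def cantor(x, max_digits=60):
    """Cantor's function on [0, 1) via ternary digits (exact for terminating x)."""
    value, weight = Fraction(0), Fraction(1, 2)
    for _ in range(max_digits):
        x *= 3
        digit = floor(x)              # next ternary digit
        x -= digit
        if digit == 1:                # first odd digit ends the expansion
            return value + weight
        value += weight * (digit // 2)
        weight /= 2
    return value

def periodic_cantor(x):
    """Extension with g(x + 1) = g(x) + 1, so that the mass dg has period 1."""
    whole = floor(x)
    return whole + cantor(x - whole)

def grid_variation(h, order):
    """Sum of |increments| of g(x+h) - g(x) over the grid k / 3**order in [0, 1]."""
    h = Fraction(h)
    steps = 3 ** order
    diffs = [periodic_cantor(Fraction(k, steps) + h) - periodic_cantor(Fraction(k, steps))
             for k in range(steps + 1)]
    return sum(abs(b - a) for a, b in zip(diffs, diffs[1:]))
```

For instance, `grid_variation(Fraction(1, 3), 4)` evaluates the one-digit translation h = ·1, and `grid_variation(Fraction(4, 9), 4)` the two-digit translation h = ·11.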

6. The classical monotone function just discussed expresses the total
mass in (0, x) due to mass unity, evenly distributed among the ternary
intervals of order n, for each n, and its constancy in each complementary
interval of the ternary set $H_0$ expresses the fact that there is no mass in these
complementary intervals.

We now modify this construction so that the function is constant not only
in the complementary intervals of $H_0$, but also in a set (everywhere dense on
$H_0$) of the ternary intervals themselves, and show that we can choose these
additional intervals of constancy (in a more than countable number of ways)




so that the resulting functions satisfy (5) for every terminating ternary fraction
h. The functions will all possess the property (A), and therefore satisfy
(5) for non-terminating values of h, and so we shall have our example of
functions for which h has no exceptional values, (5) being always true.

The definition is obtained inductively by supposing the mass distribution
effected in the ternary intervals of order 2n and now dividing the mass of
each such interval equally among three (instead of all) of the four ternary
intervals of order 2(n+1) which it contains. The choice of these three intervals
may be effected in four different ways, and as there will be exactly
$3^n$ ternary intervals of order 2n actually containing mass, this means that
we have $4^{3^n}$ different modes of distribution to choose from at each stage. In
any case, each ternary interval of order n+2 will contain at most 1/3 of
the mass of the interval of order n containing it (viz. either 0 or exactly 1/3
of it).

As the mass of any interval is the increment of the monotone function
g(x) representing the ultimate mass in (0, x), we see that each function so obtained
possesses the property (A) of §4 with l = 2, δ = 1/3.
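The inductive distribution of mass may be sketched as follows. This is an illustration under stated assumptions: the callable `omitted`, which picks the one sub-interval left without mass, is a hypothetical stand-in for the choices discussed in the text, and intervals are identified by the tuple of even ternary digits of their left-hand end points.

```python
from fractions import Fraction
from itertools import product

# The four ternary intervals of order 2(n+1) inside one of order 2n
# are obtained by appending two even digits to the left end point.
CHILD_SUFFIXES = list(product((0, 2), repeat=2))

def distribute_mass(stages, omitted):
    """Return {digit tuple: mass} for the massive intervals of order 2 * stages.

    omitted(prefix) -> index 0..3 of the child left without mass
    (a hypothetical choice rule supplied by the caller).
    """
    masses = {(): Fraction(1)}                 # unit mass on the interval (0, 1)
    for _ in range(stages):
        next_masses = {}
        for prefix, mass in masses.items():
            skip = omitted(prefix)
            for index, suffix in enumerate(CHILD_SUFFIXES):
                if index != skip:
                    next_masses[prefix + suffix] = mass / 3   # equal thirds
        masses = next_masses
    return masses
```

With `omitted = lambda prefix: 0` every stage drops the first sub-interval; after n stages there are $3^n$ intervals carrying mass $3^{-n}$ each, in agreement with the count in the text.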

The construction provides moreover a more than countable set of functions
of this kind, for all of which (5) therefore holds whenever h is a non-terminating
ternary fraction. By eliminating a certain minority of these
functions, we may arrange so that (5) holds also for each of the remaining
countable set of values of h. The simplest way of doing this is to associate
with each of these values of h, i.e., terminating of say k digits, an infinite
sequence of indices >k:

\[ N_1^{(h)} < N_2^{(h)} < \cdots < N_i^{(h)} < \cdots \]

such that

\[ N_i^{(h)} \ne N_j^{(h')} \quad (\text{all } i, j) \]

whenever h′ ≠ h (this can always be done in any number of ways*), and to
stipulate that for each $n = N_i^{(h)}$, two ternary intervals of order 2n which are
h-translations one of the other (for that h) should never both be intervals
devoid of mass. For instance if

\[ h = \cdot a_1 a_2 \cdots a_k, \quad a_i = 0, 1 \text{ or } 2, \quad a_k \ne 0, \]

and


* For instance if $\{h_i\}$ be the considered set of values of h in any countable order, and $\{p_i\}$ the
sequence of primes in increasing order, we can associate with each $h_i$ the sequence of integers $p_i^n$,
$n = n_0, n_0+1, \cdots$.




\[ x = \cdot b_1 b_2 \cdots b_{2n}, \quad b_i = 0 \text{ or } 2, \]

is the left-hand end point of a ternary interval of order 2n devoid of mass, and
if x+h is still a left-hand end point of a ternary interval of order 2n, i.e.

\[ x + h = \cdot c_1 c_2 \cdots c_{2n}, \quad c_i = 0 \text{ or } 2, \]

then this latter interval must not be devoid of mass. Since 2n > k+1, the two
ternary intervals considered are necessarily in two different intervals of order
2(n−1), so that at the stage n−1 in our construction we can certainly make
our choice (in at least $3^{3^{n-1}}$ different ways) so that the condition be fulfilled
for that n. By the fact that to each index n that we have particularly to take
into consideration corresponds only one h (with reference to which the restriction
is made), a question of incompatibility for the different values of h
cannot arise, and we are certainly left with a majority of the functions considered.
For this residue, we see that for each h, (5) holds outside at most two
of the ternary intervals of order 2n in each ternary interval of order 2(n−1)
for each $n = N_i^{(h)}$. Thus if $\sigma_i$ represent the intervals of this order 2n for which
(5) may possibly not hold (more precisely in which neither g(x) nor g(x+h)
is constant), we have as in the proof of Lemma 3 (inequality (8)),

\[ \int_{\sigma_i} |dg| \le \tfrac{2}{3} \int_{\sigma_{i-1}} |dg| \]

and hence this variation tends to 0 with 1/i. Thus (5) holds in the complementary
intervals of a set over which g(x) has total variation 0, that is, holds
absolutely, for the arbitrary considered terminating h.

7. The object of rarefying the exceptional values of h as much as possible
was achieved by an increased concentration of the unit mass distributed over
our interval (0, 1), introducing additional intervals of constancy for g(x).
Conversely, we can multiply the exceptional set of values of h by breaking up
the intervals of constancy, and diffusing the total mass more over the whole
interval.

By this means we shall obtain a function with a more than countable set
of exceptional h, and at the same time we shall find our example of a case in
which the lower limit corresponding to (3) is 0.*

Let $n_1, n_2, \cdots$ be a sequence of integers increasing so rapidly that
$\sum 1/n_k$ converges. Let x be any number in (0, 1). It is a theorem due to
Cantor that x may always be expressed in the form

(10) \[ x = \frac{m_1}{n_1} + \frac{m_2}{n_1 n_2} + \frac{m_3}{n_1 n_2 n_3} + \cdots \]


* The actual construction is adapted from a suggestion of Mr. A. S. Besicovitch. 




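where $m_1 < n_1$, $m_2 < n_2$, ···, and that this representation is unique, save for
the ambiguity

\[ \frac{m_k}{n_1 \cdots n_k} = \frac{m_k - 1}{n_1 \cdots n_k} + \frac{n_{k+1} - 1}{n_1 \cdots n_{k+1}} + \frac{n_{k+2} - 1}{n_1 \cdots n_{k+2}} + \cdots \]

which gives a non-terminating alternative to any terminating expression of
a number. Let us assume that every $n_k$ is even, equaling $2\nu_k$. If x is expressed
as in (10), let

\[ f(x) = \frac{m_1/2}{\nu_1} + \frac{m_2/2}{\nu_1 \nu_2} + \frac{m_3/2}{\nu_1 \nu_2 \nu_3} + \cdots \]

if every $m_k$ is even, and

\[ f(x) = \frac{m_1/2}{\nu_1} + \frac{m_2/2}{\nu_1 \nu_2} + \cdots + \frac{m_{n-1}/2}{\nu_1 \nu_2 \cdots \nu_{n-1}} + \frac{[m_n/2] + 1}{\nu_1 \nu_2 \cdots \nu_n} \]

if $m_n$ is the first odd $m_k$. Clearly f(x) is a monotone function, constant over
every interval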


( + 1 + 2 ) 
where < vx. 
Ne Ny 


These intervals for any given value of k fill up half the line, and for k = 1, 2, 3,
···, K fill up a set of intervals of measure $1 - 2^{-K}$.
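The representation (10) can be computed digit by digit, in the same way as an ordinary decimal expansion but with a varying base. The following sketch is illustrative only; the function names and the finite list `bases` (standing for $n_1, n_2, \cdots$) are assumptions of the example.

```python
from fractions import Fraction

def cantor_series_digits(x, bases):
    """Greedy digits m_1, m_2, ... of x in [0, 1) for the bases n_1, n_2, ...

    Returns (digits, remainder); the remainder is what is left after the
    last base has been used, scaled to lie in [0, 1).
    """
    x = Fraction(x)
    digits = []
    for n in bases:
        x *= n
        m = x.numerator // x.denominator   # integral part, so 0 <= m < n
        digits.append(m)
        x -= m
    return digits, x

def cantor_series_value(digits, bases):
    """Sum m_1/n_1 + m_2/(n_1 n_2) + ... for the given digits."""
    total, denominator = Fraction(0), 1
    for m, n in zip(digits, bases):
        denominator *= n
        total += Fraction(m, denominator)
    return total
```

For example, with the bases 4, 6, 8 the number 5/8 has the digits 2, 3, 0, since 2/4 + 3/24 = 5/8.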
Let h be defined as

\[ h = \frac{a_1}{n_1} + \frac{a_2}{n_1 n_2} + \frac{a_3}{n_1 n_2 n_3} + \cdots \]

where every $a_k$ is 0 or 2. We have


a2 a3 


fla + h) — f(x) =—~+ 


2” 


whenever 


2S m, Sn, — 2, 2 S S M1 — 2,°°-, 


where r is the first index for which $a_r \ne 0$. The total variation of f(x) over the
set of points for which $m_k < 2$ does not however exceed $1/\nu_k$, and the same is
true of the total variation of f(x+h) over those intervals. A similar result
applies to the range $n_k - m_k < 2$. Thus the total variation of f(x+h) − f(x)
over these intervals does not exceed 




which is hence an upper bound for the total variation of f(x+h) − f(x) over
(0, 1). The total variation of f(x) over the same interval is 1. Thus we have
here a set of values of h of the power of the continuum including arbitrarily
small values, for which


1 1 


and the total variation on the left-hand side tends to zero (since (11) does 
so) when h tends to zero through this set. 


MASSACHUSETTS INSTITUTE OF TECHNOLOGY, 
CAMBRIDGE, Mass. 

GIRTON COLLEGE,
CAMBRIDGE, ENGLAND 


ON A SERIES OF INVOLUTORIAL CREMONA 
TRANSFORMATIONS OF SPACE DEFINED 
BY A PENCIL OF RULED SURFACES* 


BY 
VIRGIL SNYDER 


1. Introduction. The transformations here discussed are of interest on 
account of the peculiar manner in which the fundamental elements enter; 
these singularities have not been mentioned in the existing literature. 

Consider a pencil of surfaces $|F_n|$ having a rational curve r to multiplicity
n−2 as part of its base curve. Let the points M of r and the surfaces of the
pencil |F| be in (1, 2) correspondence. A point P fixes a surface F(p) of the
pencil and this in turn a point M on r. The line PM meets F(p) in one residual
point P′. The relation between P, P′ is an involutorial birational transformation
of space. The case in which the surfaces F are of order n, and the base
curve an (n−2)-fold line and a single residual curve, has been considered by
Carroll.† The case in which the curve r is of order >1 and the residual curve
simple, including the (ruled) quartics through a double cubic curve r, has
been discussed by Black.‡

The present paper discusses the possible cases in which every surface of
the pencil is ruled. With the exception of the pencil of quartics through a
double cubic curve and of the cubics with a common double directrix, the surfaces
of the pencil are of order 2n+m; the base curve consists of a line p to
multiplicity 2(n−1)+m and a double curve r of order n meeting the line in
n−1 points. Apart from the multiple directrix line and the double curve, the
residual base of the pencil consists entirely of generators; each surface of the
pencil accounts for m parasitic lines.

The new transformations include a number of well known types, but also 
many new ones in which m, n, k may each take any positive integral value. 

2. Equations. The rational curve r defined parametrically by

\[ x_1 = f\lambda, \quad x_2 = f\mu, \quad x_3 = g, \quad x_4 = h, \]

wherein f, g, h are binary forms in λ, μ of degree n−1, n, n respectively, meets
the line p: $x_1 = 0$, $x_2 = 0$ in n−1 points. Let a point $M = (0, 0, z_3, z_4)$ on p, and
a point (λ, μ) on r be in (2, m) correspondence defined by $z_3^2 u + z_3 z_4 v + z_4^2 w = 0$,
* Presented to the Society, December 29, 1932; received by the editors, October 10, 1932. 


† American Journal of Mathematics, vol. 54 (1932), pp. 707-717.
‡ These Transactions, vol. 34 (1932), pp. 795-810.




wherein u, v, w are binary forms of degree m in λ, μ. Lines joining corresponding
points generate a ruled surface. Since $x_1 \mu = x_2 \lambda$, if we write

\[ f(x_1, x_2)x_3 - g(x_1, x_2) = fx_3 - g = \xi \quad \text{and} \quad fx_4 - h = \eta, \]

its equation has the form

(1) \[ F \equiv \xi^2 u + \xi\eta v + \eta^2 w = 0. \]

The surface is of order 2n+m; it contains r to multiplicity 2 and p to multiplicity
2(n−1)+m. From every point of p issue m generators apart from p,
and the line p counts for 2(n−1) generators, two for each point of intersection
of r and p.
Let F′ = 0 be of the same kind, in which u, v, w are replaced by u′, v′, w′
of the same degree m, and f, g, h be the same as before. Let surfaces of the
pencil

(2) \[ lF' - l'F = 0 \]

and points M of p be in (k, 1) correspondence defined by $z_3\phi_4(l, l') - z_4\phi_3(l, l') = 0$,
each $\phi_i$ being a fixed binary form of degree k.

A point $(y) = (y_1, y_2, y_3, y_4)$ in space uniquely fixes the surface of the pencil
(2) passing through it: $l = F(y)$, $l' = F'(y)$, $z_3 = \phi_3(F(y), F'(y))$, $z_4 = \phi_4(F(y), F'(y))$.
A point (x) on the line joining (y) to M has coordinates of the form

\[ \rho x_1 = \tau y_1, \quad \rho x_2 = \tau y_2, \quad \rho x_3 = \tau y_3 + \sigma z_3, \quad \rho x_4 = \tau y_4 + \sigma z_4. \]

For the point (y′) in which the line meets $F_l = 0$ again,

\[ \tau = f(\xi z_4 - \eta z_3)[\xi\{z_3(u'v) + z_4(u'w)\} + \eta\{z_3(u'w) + z_4(v'w)\}], \]
\[ \sigma = -(\xi z_4 - \eta z_3)[\xi^2(u'v) + 2\xi\eta(u'w) + \eta^2(v'w)], \]

wherein (u′v) = u′v − uv′, etc.

The relation between the points (y) and (y′) is an involutorial Cremona
transformation J of space. The factor $\xi z_4 - \eta z_3$ divides out of the transformation.
When $z_3$, $z_4$ are fixed, $\xi z_4 - \eta z_3 = 0$ represents a cone containing r and having
M on p for vertex. It meets each surface of the pencil (2) belonging to M
in the generators passing through M, and in the base lines p and r of the
pencil.

We may now write, after removing the factor,

\[ \tau = f[z_3\{\xi(u'v) + \eta(u'w)\} + z_4\{\xi(u'w) + \eta(v'w)\}], \]
\[ \sigma = -[\xi\{\xi(u'v) + \eta(u'w)\} + \eta\{\xi(u'w) + \eta(v'w)\}]. \]
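The removal of the common factor may be followed in a short computation. This is a sketch supplied for the reader and not taken from the paper; the abbreviations A, B, P, Q, R are introduced here only for this purpose, and ξ, η, f, u, v, w without argument are evaluated at (y). Write $P = (u'v)$, $Q = (u'w)$, $R = (v'w)$ and

```latex
\begin{aligned}
A &= 2\xi z_3 u + (\xi z_4 + \eta z_3) v + 2\eta z_4 w, \qquad
B = z_3^2 u + z_3 z_4 v + z_4^2 w, \\
\xi(x) &= \tau^{\,n-1}(\tau\xi + \sigma f z_3), \qquad
\eta(x) = \tau^{\,n-1}(\tau\eta + \sigma f z_4), \\
F(x) &= \tau^{\,2n-2+m}\bigl(\tau^2 F + \tau\sigma f A + \sigma^2 f^2 B\bigr), \\
F\,F'(x) - F'\,F(x) &= \tau^{\,2n-2+m}\,\sigma f\,
   \bigl[\tau (F A' - F' A) + \sigma f (F B' - F' B)\bigr], \\
F B' - F' B &= (\eta z_3 - \xi z_4)\bigl[z_3(\xi P + \eta Q) + z_4(\xi Q + \eta R)\bigr], \\
F A' - F' A &= (\eta z_3 - \xi z_4)\bigl[\xi^2 P + 2\xi\eta Q + \eta^2 R\bigr].
\end{aligned}
```

Here A′, B′ denote the same expressions formed with u′, v′, w′. The residual intersection satisfies $\tau(FA' - F'A) + \sigma f(FB' - F'B) = 0$, so that $\tau : \sigma = f(FB' - F'B) : -(FA' - F'A)$, and the factor $\eta z_3 - \xi z_4$ common to both terms divides out, leaving expressions of the form written above.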


Under J every surface of the pencil is transformed into itself. The invariant
points are the points of contact of tangents from M to the k surfaces belonging
to M. But since all the surfaces of the pencil (1) are ruled, if any point is fixed,
the entire generator passing through the point is fixed. These lines generate
the surface σ = 0. The surface τ = 0 is the image of the line p. It consists of two
parts, f = 0, images of the n−1 points of r on p, and of a ruled surface, image
of the other points of p. The generators of each surface of (1) passing through
its associated point M are fundamental lines of the second kind. When (y) is
chosen on any such line, the point (y′) is not defined, but is the whole line
passing through the point (y). As M describes p these lines describe a ruled
surface R defined by


(3) \[ R: \xi\phi_4 - \eta\phi_3 = 0, \]


which plays the most important part in the transformation.

3. Transformation of the pencil of planes (p). Every plane through p is
transformed into itself. In each such plane the involution is of order 2k+2;
it is of the non-perspective Jonquières type, having the isolated point on r as
fundamental point of order 2k+1. The plane meets each surface of (1) in p
and in two lines which are interchanged by J. Among them are two lines
which are invariant point by point; the locus of these lines is the surface
σ = 0. The class is k, as is also evident from the definition of the space transformation.
The image of the point L on r consists of 2k+1 lines, all belonging
to τ = 0. The images of the lines of the plane are curves of order 2k+2 having
2k+1 common tangents at the fundamental point L. There are also 2k+1
simple fundamental points of the plane transformation on p. The lines joining
these to L on r are all generators of the surface R. Of the 2(k+1)(n−1)
+(k+2)m tangent planes through p which belong to a general surface of the
web of conjugates of the planes of space, 2(k+1)(n−1)+km are fixed for
every surface of the web; of these 2(k+1)(n−1) are the planes of f = 0 each
counted to multiplicity 2(k+1), and the km other ones are tangent planes of
R, defined by $y_3\phi_4 - y_4\phi_3 = 0$. The 2m variable tangent planes of the conjugate
of the plane (ax) = 0 are defined by


\[ x_3\{a_3(u'v) + a_4(u'w)\} + x_4\{a_3(u'w) + a_4(v'w)\} = 0. \]


Moreover, all the surfaces of the web also touch each other along every one
of the 2k+1 sheets through r. These are defined by $y_3 z_4 - y_4 z_3 = 0$. There is
simple contact along p and r, including all the sheets of each, and f counted
2(k+1) times.

When a generator in any plane π through p passes through M, it is parasitic,
that is, a fundamental line of the second kind. The point M is then the
conjugate of the other generator in π. Thus, to each point M of p correspond
mk lines, fundamental of the first kind, and the line p is itself multiply
parasitic for every point on it.




4. Residual base elements. The base of the pencil (1) consists of r to
multiplicity 2, of p to multiplicity 2(n−1)+m, and of 4(n−1)+4m generators.
Of these, 4(n−1) consist of tangency along p in the planes of f = 0 and
4m are generators not coincident with p. Let $g_i$ be one of these latter base
generators, and $\pi_i$ the plane p, $g_i$. Corresponding to every point M on p are
k residual generators in $\pi_i$ associated with M, all belonging to a pencil of lines
with vertex on r. As M describes p, these lines generate $\pi_i$ in such manner that
the complete image of $g_i$ is $\pi_i$ counted k times.

There are k+1 positions of M for which the residual generator on some
surface of the pencil (1) passes through it; these are all parasitic lines and are
base lines of the web of surfaces conjugate to the planes of space. The composite
surface τ = 0 consists of the planes f = 0, images of the n−1 points of r
on p, and of a ruled surface of order (2k+1)n+(k+2)m, having p to multiplicity
(2k+1)n+(k+2)m−(k+1) and r to multiplicity 2k+1. The locus
σ = 0 of invariant points is a ruled surface of order 2(n+m), has p to multiplicity
2(n−1)+2m, and r to multiplicity 2. Since all the fundamental elements
are included in τ = 0, σ = 0, the complete configuration can now be accounted
for.

5. Table of characteristics. The images of planes and of fundamental 
elements can now be expressed by the following table: 


gi ~ 
= G2(n¢m)2 724mg 4 kmeg’ ; 


J = — (wo’)?] 


All the surfaces of the web have the same tangent planes along all the 2k+1
sheets through r. These are the tangent planes of the ruled surface R of
fundamental lines of the second kind. The line p illustrates Montesano’s
theorem for exceptional fundamental lines of the second kind.*

Thus, the complete intersection of any two surfaces of the web consists of
the curve conjugate of a line, of the basis elements p, r each to the multiplicity
indicated, simple contact along each sheet of τ = 0 along p, of R along r, of
4m(k+1) lines and of each sheet of f counted 2(k+1) times.


* D. Montesano, Sulla teoria generale delle corrispondenze birazionali dello spazio, Rendiconti della 
Accademia dei Lincei, (5), vol. 27; (1918), pp. 396-400 and pp. 438-441; and (5), vol. 30; (1921), 
pp. 447-451. 




6. Types not in the preceding category. The preceding list includes all
possible types for arbitrary values of n and m, but for particular values others
may appear.

A pencil of quadrics and an arbitrary line p, not a basis line, lead at once
to a series of transformations which include one discussed by Montesano.*

The line p may be replaced by any rational curve. The congruence of
bisecants of the base $C_4$ is left invariant.

The next case is that of a pencil of cubic ruled surfaces having a common
double directrix. The residual is then a rational quintic $r_5$ meeting the double
directrix in four points. The point M is now on $r_5$. The residual section of a
plane through d and M consists of a generator through M on each of the k
surfaces associated with M. Each of these lines is parasitic. Every point P of
d is invariant except for the plane containing a generator through P.

Let d be x;=0, x,=0. Then u=0 etc. being 
planes, Ff =x?u’+--- =0. 

Let ux,—Ax; =0. Then the parametric equations of 7; are 


in which ¢ and s are quintics and f a quartic binary form. Since f is a factor of 
τ, the image of d includes the 4k surfaces of the pencil associated with the
points in which $r_5$ meets d. The image of the line d is the surface
$R \equiv x_3\phi_4(F, F') - x_4\phi_3(F, F') = 0$, of order 3k+1 containing d to multiplicity 2k+1, and $r_5$ to
multiplicity k. Every generator is parasitic, hence R also appears as a factor
in the transformation. Given a point P on $r_5$. The image of P on F(M) is the
residual point P′ in which the line PM meets F(M). As M describes $r_5$, this
line describes a rational quartic cone, and the locus of P′ is a curve of order
4k+N having P to multiplicity N. The tangent plane to F(M) at P meets $r_5$
twice at P and in three other points K. Conversely, given K, then the line
KP and the tangent t to $r_5$ at P uniquely fix a tangent plane to F, hence the
point M. The (1, 3k) correspondence between M and K on a rational carrier
has 3k+1 points of coincidence, hence P is of multiplicity 3k+1, and the
curve is of order 7k+1. In addition, any point M on $r_5$ is transformed into the
conics through it in the tangent plane to each F(M) at M, residual to the
generator $g_M$. This makes the complete curve of order 9k+1. Hence $r_5$ appears
to multiplicity 9k+1 on the surfaces conjugate to the planes of space
in the involution. This can also be seen directly from the equations of the
transformation: Consider a general point P on d. It remains invariant on

* Su una classe di trasformazioni razionali ed involutorie dello spazio di genere arbitrario n e di 
grado 2n+1, Giornale di Matematiche di Battaglini, vol. 31 (1893), pp. 36-50. 




every F(M) of the pencil except when PM is the generator $g_M$ on F(M). This
happens only for the four positions of M on d. The surface R meets any F of
the pencil in d counted 4k+2 times, in $r_5$ counted k times, and in a single
generator. Let D be one of the points (d, $r_5$). Any plane through d meets
$F_{(D)}$ in $d^2$ and a generator g. The image of the generator is D. Hence $F_{(D)}$ is
the image not of the line d but of the point D on d.


~ Sores gid 14D 
d ~ 


D ~ kF3:d?*r*4D?*., 


The contacts along d and r are as in the general case. 

Various particular cases may arise, when $r_5$ is composite, consisting of one
or more generators and a residual curve. When the generator is taken as a
projector curve, the result is included in the preceding category; when the
residual curve is the locus of M, the image of each generator is a surface of
order 6, found as in the general case. The order of the transformation is
lowered by unity for each base generator, since the plane d, g will divide out.

A third particular case is that in which the pencil of quartics have a com- 
mon double cubic curve. This has been fully treated by Black.* 

The space cubic may be replaced by a line and a conic meeting it in one 
point, or by three lines, one of which meets each of the other two, which are 
skew. The common transversal is a double generator. When either double 
directrix is used as projector, this is included in the general case, whether the 
basal double generator exists or not. When the double generator is used as 
projector, it is included among those treated by Carroll, the residual basis 
line now consisting of four simple generators, all skew to the double one. But 
this case offers differences that warrant a more detailed treatment, since no 
other generators meet g, hence there is no surface of parasitic lines apart from 
the planes d, g and d’, g. 

Let d=x,=0, x2=0, d’=2x.=0, x,=0 be the two double directrices, and 
g=x,=0, x2=0 the double generator. The equation of the quartic has the 
form Fy=x?u+21x%2x30+-axP x? =0, wherein u is a binary quadratic form in 
X2, X4, v is linear in %2, x4, and a is constant. 

Let u’, v’, a’ define F{ having the same double elements. The pencil 
IF'—I'F =0 is then associated with the point M=(0, 0, 2s, 24) of g by the 
relation z34(1, 1’) —2s3(/, 1’) =0, wherein as before ¢; is a binary form in /, 


* Loc. cit. 




l’ of degree k. The residual base of the pencil consists of four generators g;, 
which meet d and d’ but do not meet g. The pencil contains two composite 
surfaces, one consisting of the plane x, =0 and a ruled cubic containing d’ as 
double directrix, and the other of x2=0 and a ruled cubic containing d 
as double directrix. These planes both divide out, each reducing the order of 
the transformation by one. 

The conjugate of the double generator g is generated by the residual
conics to surfaces of |F| at M. It is a surface of order $8k+4$, has d, d’ and g
each to multiplicity $4k+2$, and the simple basic generators $g_i$ each to
multiplicity $2k+1$. This surface is ruled.

The surface o = 0 of invariant points is of order $4k+5$, has d, d’, g each
to multiplicity $2k+1$, and the lines $g_i$ each to multiplicity $k+1$.

The image of $g_i$ in the plane $Mg_i$ consists of k nodal cubic curves. As M
describes g, these curves generate a surface of order $4k+1$, containing d,
d’, g each to multiplicity $2k$, $g_i$ to multiplicity $k+1$, and the other basic
generators to multiplicity k. The images of d, d’ are the planes (d, g), (d’, g)
respectively. The transformation is now completely defined.


CORNELL UNIVERSITY,
ITHACA, N. Y.




NOTES ON THE THEORY AND APPLICATION 
OF FOURIER TRANSFORMS. I-II* 


BY 
R. E. A. C. PALEY AND N. WIENER


INTRODUCTION 


We propose to publish under the above title a series of notes. The results 
are of a varied nature, but the methods we employ are very similar and 
consist, roughly speaking, in conformally mapping the unit circle into a half 
plane, and considering the Fourier transforms of functions defined on the 
boundary of the half plane. The notes may be read independently. 


I. ON A THEOREM OF CARLEMAN 


1. The chief object of this note is to give a simple proof of the following
theorem which is substantially the same as one due to Carleman.† Let
$A_0 = 1, A_1, \dots, A_\nu, \dots$ be a set of positive numbers, and let $C_A$ denote
the set of functions defined in the interval $(-\infty, \infty)$, infinitely many times
differentiable in that range, and satisfying the inequalities


(*) $\left\{ \int_{-\infty}^{\infty} |f^{(\nu)}(x)|^2\,dx \right\}^{1/2} \le B^{\nu} A_{\nu}$,


where B is a constant which may depend on f(x). We say that the class $C_A$
is quasi-analytic if a function of $C_A$ is defined completely over $(-\infty, \infty)$ by
the values of its derivatives $f^{(\nu)}(x_0)$ $(\nu = 0, 1, 2, \dots)$ at a single point $x_0$, or,
what is the same thing, if the equations

$f^{(\nu)}(x_0) = 0$ $(\nu = 0, 1, 2, \dots)$,
together with the condition $f(x) \in C_A$, imply that f(x) vanishes identically.
The theorem is the following:


THEOREM I. A necessary and sufficient condition that $C_A$ should be
quasi-analytic is that the integral


x” dx 
1 ] 
J, 


* Presented to the Society, February 25, 1933; received by the editors January 19, 1933.
[Mathematical science has suffered an irreparable loss in the untimely death of R. E. A. C. Paley. He was
killed on April 7, 1933, at the age of twenty-five, in an accident that occurred during a skiing
excursion near Banff, Alberta. J. D. Tamarkin.]

† T. Carleman, Les Fonctions Quasi-Analytiques, 1926. We have slightly modified Carleman’s
definition of $C_A$. We consider $\int_{-\infty}^{\infty} |f^{(\nu)}(x)|^2\,dx$ instead of $\max_{-\infty<x<\infty} |f^{(\nu)}(x)|$, the problem in this
form being more adaptable to our attack, but the difference is not at all essential.



should diverge, or, what is the same thing, that the least non-increasing majorant 
of the series 


v=0 
should diverge. 


The equivalence of the two conditions has been established by Carleman 
in his book.* In this paper we shall concern ourselves only with the first one. 
2. We begin by proving the following theorem. 


THEOREM II. Let $\phi(x)$ be a real non-negative function not equivalent to zero,
defined for $-\infty < x < \infty$, and of integrable square in this range. A necessary
and sufficient condition that there should exist a real- or complex-valued function
F(x) defined in the same range, vanishing for $x \ge x_0$ for some number $x_0$, and
such that the Fourier transform G(x) of F(x) should satisfy $|G(x)| = \phi(x)$, is
that


(2) $\int_{-\infty}^{\infty} \frac{|\log \phi(x)|}{1+x^2}\,dx < \infty.$


We observe that the theorem is similar to one due to de la Vallée Poussin.†
He concerns himself with the Fourier coefficients of a periodic function
all of whose derivatives vanish at some fixed point. Here we demand rather
more of the function, and we undertake actually to fix the modulus of the
transform, subject of course to the convergence of (2).
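
As a quick illustration of condition (2), which is a sketch and not part of the original note, one may compare two square-integrable moduli. Both choices of $\phi$ below are assumptions made only for the example.

```latex
% Illustrative check of condition (2); both choices of \phi are hypothetical examples.
% (a) \phi(x) = (1+x^2)^{-1}: the integral in (2) converges.
\[
\int_{-\infty}^{\infty} \frac{|\log \phi(x)|}{1+x^{2}}\,dx
  = \int_{-\infty}^{\infty} \frac{\log(1+x^{2})}{1+x^{2}}\,dx
  = 2\pi \log 2 < \infty .
\]
% (b) \phi(x) = e^{-|x|}: the integral in (2) diverges.
\[
\int_{-\infty}^{\infty} \frac{|\log \phi(x)|}{1+x^{2}}\,dx
  = \int_{-\infty}^{\infty} \frac{|x|}{1+x^{2}}\,dx = \infty .
\]
```

If Theorem II is applied to these two cases, the first modulus belongs to the transform of some function vanishing on a half-line, while the second, although of integrable square, does not.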

3. Suppose first that the integral (2) converges. We write for $z = x + iy$,
$y > 0$,

\[ \lambda(z) = \frac{1}{\pi} \int_{-\infty}^{\infty} \frac{\log \phi(x')\, y}{(x - x')^2 + y^2}\,dx', \]

which is harmonic in the half plane $y > 0$. Let $\mu(z)$ be its conjugate, and write
\[ h(z) = \exp[\lambda(z) + i\mu(z)]. \]

It is well known, by an argument of the Fatou type, that, for almost all x,

\[ \lim_{y \to 0} \lambda(x + iy) = \log \phi(x), \]

or, what is the same thing,
\[ \lim_{y \to 0} |h(x + iy)| = \phi(x). \]


We observe first that, by the convexity property, 


* Carleman, loc. cit., pp. 50ff. 
† Carleman, loc. cit., pp. 76 and 91.




$(x’)y 
h(x + iy)| = &) <s — —f 
| iy) | wee 


and tends to zero as $y \to \infty$, uniformly in x. This shows that $h(x+iy)$ is
uniformly bounded in any half-plane $y \ge y_0 > 0$.
Next 


| Me + iy) — fas Ge dx 


and is therefore uniformly bounded in y. Now let $0 < y_0 < y < y_1$. Cauchy’s
theorem gives 
— 2nih(x + iy) 
h(a’ + iyo) wer h(x!’ + 
(x — 2’) + i(y — yo) (x — x’) + i(y — 


K(N + iy’) h(— N + iy’) dy 


Making first N and then $y_1$ tend to infinity in the last formula we obtain

(3) $h(x + iy) = -(2\pi i)^{-1} \int_{-\infty}^{\infty} \frac{h(x' + iy_0)}{(x - x') + i(y - y_0)}\,dx'.$
Now let H,(£) denote the Fourier transform 


A 
Licm. h(x + iy)e*** dx 


A-@ —A 
of h(x+iy). Since the Fourier transform of 
— (2ni)- [x + i(y — 


is 


for negative $\xi$, and vanishes for positive $\xi$, it follows that




0, t>0. 


Thus we have $H_y(\xi) = 0$ for $\xi > 0$ and all positive y, and this gives


tH, < 0, 
vo(§) { 0, > 0. 


Let us keep y fixed and make $y_0$ tend to zero. Since


f | h(x + iyo) |2dx 
is bounded, it follows that 


0 

(4) | 
is bounded and increasing as yo decreases to zero, and thus the integral (4) 
tends to a limit as yp 0, and H tends to H in the mean of 
order 2. The Fourier transform of the function which coincides with 
for — 0 <£<0, and which vanishes for is h(x+iyo), and 
hence $h(x+iy_0)$ tends in mean of order 2 to a function G(x) as $y_0 \to 0$. We have
shown that the Fourier transform of $h(x+iy_0)$ (with $y_0$ fixed and positive)
vanishes for $\xi > 0$, and it follows that the same is true of the Fourier
transform $F(\xi)$ of G(x). We have already seen that $|G(x)| = \phi(x)$.

Now suppose that $F(x')$ vanishes for $x' > x_0$, where we may suppose
without loss of generality that $x_0 = 0$. We are to show that the integral (2)
converges. We write


N 
G(x) L.i.m. F(x’ 


—N 


N 
¥(z) = Lim. F(x’ )e-**'dx’, Yz > 0. 
N-o 


—N 


The function $\psi(z)$ is readily seen to be analytic in the half plane $\Im z > 0$.
Suppose that we invert the half plane $\Im z > 0$ into the circle $|\zeta| < 1$, $\zeta = re^{i\theta}$, and
that G(x) becomes $\Gamma(e^{i\theta})$ and $\psi(z)$ becomes $\gamma(\zeta)$. Then it is easily seen that


| |2d0 = 2f = dx, 


so that $\Gamma$ certainly is of class $L^2$. Also a simple calculation shows that if
$re^{i\theta}$ is the inverse of $x'+iy'$, then


1 — 2rcos(@— +7’ (a — x’)? + 
= mf Li.m. f F(é)e~**dé 
(x om x’)? _N 


(2 — + y" 


lim 


° F(é)d— 
lim f f 
= (x x’)? + y”? 


0 
lim f = iy’) = +(re'*), 


so that $\gamma$ is in fact the Poisson integral of $\Gamma(e^{i\theta})$. Then
(2) f log +| | dé 


(S) 
< f | y(rei*) < f | |2d0. 


It is known by a theorem of Ostrowski* and Nevanlinna that the
boundedness of the integral (5) implies that of the integral


(2)! | log | y(re*) | |da. 


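Since finally $\log|\gamma(re^{i\theta})|$ tends almost everywhere to $\log|\Gamma(e^{i\theta})|$ we have

\[ \int_{-\pi}^{\pi} \bigl|\log|\Gamma(e^{i\theta})|\bigr|\,d\theta < \infty, \]

and inverting again to the half plane this implies that

\[ \int_{-\infty}^{\infty} \frac{\bigl|\log|G(x)|\bigr|}{1 + x^2}\,dx < \infty, \]

which is the same as (2).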
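
The passage between the circle and the half plane can be made explicit. The sketch below assumes that the inversion sends the boundary point $e^{i\theta}$ to $x = \tan(\theta/2)$, which is the correspondence stated in Note II; Note I does not spell out the map at this point, so this is an assumption. $\Gamma$ and $G$ are the boundary functions already introduced.

```latex
% Assumption: boundary correspondence x = \tan(\theta/2), -\pi < \theta < \pi,
% under which \Gamma(e^{i\theta}) = G(x).
\[
x = \tan\frac{\theta}{2}, \qquad
d\theta = \frac{2\,dx}{1+x^{2}},
\]
\[
\int_{-\pi}^{\pi} \bigl|\log|\Gamma(e^{i\theta})|\bigr|\,d\theta
  = 2\int_{-\infty}^{\infty} \frac{\bigl|\log|G(x)|\bigr|}{1+x^{2}}\,dx .
\]
```

The weight $1/(1+x^2)$ in (2) is thus the image of the uniform measure on the circle.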
4. Returning now to the proof of Carleman’s theorem we observe first 
that if the integral (1) converges, and 
oo —1 
= ora + 
v=0 A? 


* See, e. g., A. Ostrowski, Uber die Bedeutung der Jensenschen Formel fiir einige Fragen der kom- 
plexen Funktionentheorie, Acta Szeged, vol. 1 (1923), pp. 80-87. 




| log | 


dx < f o(x)*dx 
x 


so that we can find a function $F(x) \in L^2$ which vanishes for $x > 0$ but not
identically and with its Fourier transform G(x) satisfying $|G(x)| = \phi(x)$.
Finally we have


| F(x) = fll G(x) 


x” 


-1 
=) 


= f f [10(1 + x?) 
Thus the divergence of the integral (1) is certainly necessary for the
quasi-analyticity of $C_A$.

Suppose now that f(x) vanishes with all its derivatives at x =0, but does 
not vanish identically. We are to show that the integral (1) converges. Let 
F(x) be identical with f(x) for negative x and vanish identically for positive x, 
and let G(x) be its Fourier transform. Then, assuming, as we may, that, in 
formula (*), B=1, we have 


A? 2 f | f(x) |%dax = f F(x) = f | G(x) 


E[ 


2r+1 —1 2r+l 
|) rae | )s 2 f | log (2-¥/2| G(x) | ) 


It follows that 


f loa < 2f <f | log (2-1/2 | G(x) |) |dx 
1 1 r? J 


2/2 dr 


< 2 f | tog (2-1/2 | G(x) lax f 


{2-1/2 r? 


< 20 log (2-1/2 | G(x) |) <0, 



then 

if 

4 

log ( =) 
v=0 A? 

: q 

v=0 

2 

Hence 




Thus the divergence of (1) is also sufficient for quasi-analyticity. 


II. ON CONJUGATE FUNCTIONS 


1. We prove the following theorem. 

THEOREM. Let $f(\theta)$ be an odd function defined in $(-\pi, \pi)$ and suppose that
it is absolutely integrable and non-decreasing in this range. Let $\tilde f(\theta)$ be the
conjugate function. Then $\tilde f(\theta) \in L$.


We may assume, without loss of generality, that, in addition to the above
properties, $f(\theta)$ satisfies the condition of assuming only integral values, for
it differs by at most 1 from a function having all these properties. Then the
function conjugate to this step function will differ from $\tilde f(\theta)$ by a function
which is certainly of class $L^2$.

2. Let us invert the circle $|z| < 1$ into the half plane $\Im Z > 0$, so that the
point $e^{i\theta}$ inverts into $X(\theta) = \tan \theta/2$. Suppose that $f(\theta)$ inverts into $n(X)$.
Then $n(X)$ is also non-decreasing in $(-\infty, \infty)$, and


n(X 


1 + X? 


Let the points $\pm\lambda_\nu$, $\lambda_1 \le \lambda_2 \le \cdots$, be those at which $n(X)$ increases by 1
(if $n(X)$ increases by more than 1 then $\lambda_\nu$ is taken an appropriate number of
times). We may assume without loss of generality that $\lambda_\nu$ is never zero, or,
what is the same thing, that $n(X)$ is zero in the neighborhood of the origin.


Then the convergence of the second integral (1) implies that of the series 


(2) 


n=1 


3. Now consider the branch of the function 


(3) ~ ¥ tog ( —) 


v=l 


which takes the value 0 at the origin, and is regular in the half plane $\Im Z \ge 0$,
$Z \ne \pm\lambda_\nu$ (the function (3) certainly exists in virtue of (2)). We observe that
the real part of the function (3) coincides with $n(X)$ on the real axis, and is
indeterminate at the points $\lambda_\nu$. Further it may easily be seen that the
imaginary part of the function (3) on the real axis differs only by a constant


(1-55) 


ci=- ni f dX 
+ 


since 




(1 ~)] + ct = f Hoa 0 

from the transform to the conjugate $\tilde f(\theta)$ of $f(\theta)$. Thus it is sufficient to
establish the existence of the integral


7S we = (1-55) 


The integral (4) does not exceed 


in virtue of (2), and our theorem is proved. 

4. We observe finally that the restriction that $f(\theta)$ should be an odd
function is an essential one. For suppose that the theorem were true without this
restriction. Let $f(\theta)$ be a function which is positive and which increases in
$(-\pi, \pi)$, satisfying the two conditions


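\[ \int_{-\pi}^{\pi} f(\theta)\,d\theta < \infty, \qquad \int_{-\pi}^{\pi} f(\theta) \log^{+} f(\theta)\,d\theta = \infty. \]


Then, on the assumption that the theorem is true in the extended form, we
should have


\[ \int_{-\pi}^{\pi} |\tilde f(\theta)|\,d\theta < \infty. \]


But, by a theorem of M. Riesz,* the last inequality, together with the
condition $f(\theta) > 0$, implies that


\[ \int_{-\pi}^{\pi} f(\theta) \log^{+} f(\theta)\,d\theta < \infty, \]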


giving a contradiction. 


* Shortly to be published in the Journal of the London Mathematical Society. 


MASSACHUSETTS INSTITUTE OF TECHNOLOGY, 
CAMBRIDGE, Mass. 




PFAFFIAN SYSTEMS OF SPECIES ONE* 


BY 
JOSEPH MILLER THOMAS 


This paper begins a study of the minimum number of differentials in 
terms of which a Pfaffian system can be expressed, together with the related 
subjects of reduced and canonical forms for such systems. The particular case 
treated here is that in which the minimum number of differentials exceeds the 
number of equations by unity. Such a system is said to be of species one. After 
passive (completely integrable) systems it is the simplest type. It is charac- 
terized by the following property: the adjunction of a single (suitably chosen) 
equation to it gives a passive system. 

It is shown how to reduce any system of species one to a form involving 
the minimum number of differentials. 

A system is called nested if the above reduction can be effected simulta- 
neously for it and all of its derived systems. It is shown how to reduce any 
nested system of species one to a canonical form. It results that any such sys- 
tem can be written as the sum of a set of special systems, and that it is charac- 
terized by a finite set of arithmetical invariants. There is also given a neces- 
sary and sufficient condition for the existence of a nested system of species 
one having a given set of integers as invariants. 

A further study of systems of species one in relation to their derived 
systems requires a theory of systems of higher species, which it is hoped to 
develop in a subsequent article. 

The methods employed are largely those developed by Cartan and
expounded by Goursat, with whose book† we suppose the reader is familiar.

1. Generalities on the species of a Pfaffian system. A Pfaffian is a form 
linear in the differentials of a finite set of variables, the coefficients being 
analytic functions of the variables. A Pfaffian system is obtained by equating 
to zero a linearly independent (and therefore finite) set of Pfaffians. 

The variables are denoted by $x^1, \dots, x^n$.

The class of a system is the minimum number of variables in terms of
which a system equivalent to it can be expressed. A system comprising a
single equation is always of odd class $2\sigma+1$. The minimum number of
differentials which can appear in an equivalent equation is known to be $\sigma+1$,


* Presented to the Society, December 30, 1931; received by the editors January 20, 1932. 
† Goursat, E., Leçons sur le Problème de Pfaff, Paris, 1922.




and is consequently determined by the class. This is not true in general for 
systems comprising more than one equation. 
Consider the family of varieties 


(1.1) $f^1(x) = \mathrm{const.}, \dots, f^k(x) = \mathrm{const.}$,
one of which passes through every point of a region. The rank of the matrix
\[ \left\| \frac{\partial f}{\partial x} \right\| \]
is assumed to be k. If the above family is integral for the Pfaffian system
(1.2) $\omega^1 = 0, \dots, \omega^r = 0$,


the w’s must be linear homogeneous combinations of the df’s; and conversely. 
Hence, the minimum number of f’s defining a family of integral varieties one 
of which passes through every point of a region is the same as the minimum num- 
ber of differentials in terms of which an equivalent system can be expressed. 

Let the minimum number of differentials for system (1.2) be denoted by
$r+\sigma$. The non-negative integer $\sigma$ so defined will be called the species of the
system.* The justification for this name is that the species enables us to 
differentiate between systems not differing in class or genus. Thus for every 
system of two equations of class five, expressed in characteristic variables, 
the genus, as defined by Cartan, is unity, whereas the species may be either 
one or two. 

A function which is not identically equal to a constant and whose con- 
stancy is implied by (1.2) is called an integral of that system. Sometimes by 
ellipsis this term is also used to designate an integral variety, which we shall 
always designate by its full name, reserving the name integral for the concept 
to which it was attached by Poincaré. 

The maximum number of independent integrals which the system (1.2)
can possess is r and is attained only when the system is passive,† i.e., when it
is equivalent to a system of the form

* The notion of species is inherent in the paper of E. von Weber, Theorie der Systeme Pfaff’scher 
Gleichungen, Mathematische Annalen, vol. 55 (1902), pp. 386-440. Although he does not mention 
the invariance of the minimum number of differentials, von Weber considers the possibility of reduc- 
ing a given system to a form containing a specified number of differentials. In the terminology of the 
present paper, he gives necessary and sufficient conditions that the species do not exceed a specified 
value for systems whose Stufe (=class minus number of equations) does not exceed six. 

† The usual term is completely integrable or in involution. We prefer to extend Riquier’s
terminology to Pfaffian systems. The justification of this lies in the fact that a Pfaffian system passive in the
sense just defined is equivalent to a system of partial differential equations which is passive in
Riquier’s sense.

In the general theory of partial differential equations, the term passive is applicable to systems 


for which an existence theorem has not been proved and for which the name completely integrable 
would be at least temporarily a misnomer. 


(1.3) $dx^1 = 0, \dots, dx^r = 0$.


Passive systems are therefore systems of r equations whose class is r. Their 
species is zero. 

It is known that the differentials appearing in a non-passive system can
always be made fewer than the variables by at least unity. A consequence of
this is that no system of r equations can be of class $r+1$, and that the
maximum value of the species for a system of class p is $p-r-1$.

2. The maximum system of species zero implied by a given system is de- 
termined by a maximum set of independent integrals. The latter can be 
found from the last derived system of (1.2). A more direct method of finding 
it is furnished by 


THEOREM 1. The integrals of the Pfaffian system
\[ \omega^1 = 0, \dots, \omega^r = 0 \]


are the non-trivial solutions of the linear homogeneous partial differential
equation of the first order


(2.1) $\omega^1 \cdots \omega^r\, df = 0$.

The proof is almost immediate. Since the equations of the Pfaffian system
are independent, we have
(2.2) $\omega^1 \cdots \omega^r \ne 0$.


Equation (2.1) is then a necessary and sufficient condition* for the existence
of multipliers $\lambda$ such that


(2.3) $df = \lambda_1 \omega^1 + \cdots + \lambda_r \omega^r$.


This proves the theorem. 
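
Theorem 1 can be tried on a single equation in three variables. The sketch below takes $\omega = dx^1 - x^2\,dx^3$ (so $r = 1$) as an illustrative system, writes the products of differentials as exterior products with an explicit wedge, and expands (2.1). The shorthand $f_i = \partial f/\partial x^i$ is introduced here and is not the paper's notation.

```latex
% Illustrative system (r = 1): \omega = dx^1 - x^2 dx^3; f_i := \partial f / \partial x^i.
\begin{align*}
\omega \wedge df
  &= (dx^1 - x^2\,dx^3) \wedge (f_1\,dx^1 + f_2\,dx^2 + f_3\,dx^3) \\
  &= f_2\,dx^1 \wedge dx^2 + (f_3 + x^2 f_1)\,dx^1 \wedge dx^3
     + x^2 f_2\,dx^2 \wedge dx^3 .
\end{align*}
% Equation (2.1) therefore requires f_2 = 0 and f_3 + x^2 f_1 = 0.
% Since f then does not depend on x^2, the second relation forces f_1 = f_3 = 0.
```

Only constant solutions remain, so this particular equation has no integrals, as one expects of a non-passive system with $r = 1$.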
Suppose $f^1$ is a non-trivial solution of (2.1). If we have also


(2.4) $\omega^2 \cdots \omega^r\, df^1 = 0$,


from (2.3) and (2.2) we deduce that the $\lambda_1$ in (2.3) is zero. If all r expressions
of which the left member of (2.4) is the prototype were zero, equation (2.3)
would show that $f^1$ is a constant, contrary to hypothesis. It is only a matter
of notation, therefore, to assume that $f^1$ does not satisfy (2.4).

When the notation has been properly adjusted, system (1.2) is accordingly 
equivalent to 


(2.5) $df^1 = 0$, $\omega^2 = 0, \dots, \omega^r = 0$,


* Cartan, E., Bulletin de la Société Mathématique de France, vol. 29 (1901), p. 250. 




and the integrals of (1.2) and (2.5) are the same. But the latter are by
Theorem 1 the solutions of
(2.6) $\omega^2 \cdots \omega^r\, df^1\, df = 0$,


where $f^1$ is given and f is to be determined.

If (1.2) has more than one integral in its complete set, system (2.6) will
have a solution $f^2$ for which $df^1\, df^2 \ne 0$. From the discussion given above in the
case of $f^1$, we know that adjusting the notation will make


\[ \omega^3 \cdots \omega^r\, df^1\, df^2 \ne 0, \]
and (1.2) is equivalent to
\[ df^1 = 0, \quad df^2 = 0, \quad \omega^3 = 0, \dots, \omega^r = 0. \]
If another integral $f^3$ exists, it is a solution of
\[ \omega^3 \cdots \omega^r\, df^1\, df^2\, df = 0. \]
We continue until we reach an equation which has only a trivial solution.


In this way, the complete set of q integrals is found. At the same time, we
have a method of writing (1.2) in the form


(2.7) $dx^1 = 0, \dots, dx^q = 0$, $\omega^{q+1} = 0, \dots, \omega^r = 0$,
which puts in evidence its maximum system of species zero written in the
form (1.3). If $q = r$, the system is passive and the method reduces it to the
form (1.3).
3. Determination of the species. The system


(3.1) $\omega^1 = 0, \dots, \omega^r = 0$, $df^{r+1} = 0, \dots, df^{r+k} = 0$,
where
(3.2) $\omega^1 \cdots \omega^r\, df^{r+1} \cdots df^{r+k} \ne 0$,


is passive if and only if (1.2) can be expressed in terms of $r+k$ differentials.
The problem of finding a minimum set of differentials is therefore equivalent
to that of finding a minimum passive Pfaffian system which implies the given
one.
If we put


(3.3) $\omega^1 \cdots \omega^r\, \omega'^{\alpha} = \Omega^{\alpha}$,
the conditions of passivity* of (3.1) are


(3.4) $\Omega^{\alpha}\, df^{r+1} \cdots df^{r+k} = 0$.


* Cartan, E., Leçons sur les Invariants Intégraux, Paris, 1922, p. 101.




To determine the species, therefore, we form (3.1) and (3.2) for $k = 0, 1,
\dots$ until the first consistent system is reached. The k at this stage is the
species.
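
The procedure can be followed on a small case. The sketch below uses the illustrative single equation $\omega = dx^1 - x^2\,dx^3$ in three variables and assumes that the derived form $\omega'$ is the exterior derivative $d\omega$ and that $\Omega = \omega \wedge \omega'$, which is how (3.3) is read here; the wedge symbol is written out for clarity.

```latex
% Assumed reading: \Omega = \omega \wedge \omega', with \omega' = d\omega the derived form.
% Illustrative system (r = 1, three variables): \omega = dx^1 - x^2 dx^3.
\begin{align*}
\omega' &= -\,dx^2 \wedge dx^3, \\
\Omega  &= \omega \wedge \omega' = -\,dx^1 \wedge dx^2 \wedge dx^3 \neq 0
          && (k = 0 \text{ fails: the system is not passive}), \\
\Omega \wedge df &= 0 \ \text{identically}
          && (k = 1: \text{a 4-form in three variables}), \\
\omega \wedge dx^3 &= dx^1 \wedge dx^3 \neq 0
          && (\text{so } f = x^3 \text{ is an admissible choice}).
\end{align*}
```

The first consistent stage is $k = 1$, so under these assumptions the equation is of species one.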

4. Reduced form for systems of species one. A system (1.2) is in reduced 
form if it involves the minimum number of differentials and is solved for r 
of them. The set of variables whose differentials appear in a reduced form 
will likewise be described by the adjective reduced. 

Consider a system of species one expressed in the minimum number of
differentials. Since the system is not passive, there are at least two of the
$r+1$ differentials whose vanishing is not implied by the system. Call them
$dx^r$ and $dx^{r+1}$. Algebraically considered, (1.2) is a linear and homogeneous
system of rank r in $r+1$ unknowns dx, and the unknown $dx^{r+1}$ in particular
can be chosen arbitrarily. Hence (1.2) can be written in the form


(4.1) $dx^1 - A^1\,dx^{r+1} = 0, \dots, dx^r - A^r\,dx^{r+1} = 0$.

If (1.2) is written in the form (2.7), the system
(4.2) $\omega^{q+1} = 0, \dots, \omega^r = 0$
can be put in the form (4.1) because q of the $r+1$ differentials can be
eliminated by means of $dx^1 = \cdots = dx^q = 0$. Hence (1.2) can be written as
(4.3) $dx^1 = 0, \dots, dx^q = 0$, $dx^{q+1} - A^{q+1}\,dx^{r+1} = 0, \dots, dx^r - A^r\,dx^{r+1} = 0$,
where the A’s are all different from zero, and the maximum system of species
zero is
(4.4) $dx^1 = 0, \dots, dx^q = 0$.

5. Reduced variables for systems of species one. If $k = 1$, system (3.4) is
(5.1) $\Omega^{\alpha}\, df = 0$ $(\alpha = 1, 2, \dots, r)$,
and is equivalent to a system of linear homogeneous partial differential
equations of the first order in a single unknown function f. From §3, if (5.1) has a
solution not satisfying (2.1), the system (1.2) is of species one or zero. Hence
we have


THEOREM 2. A non-passive Pfaffian system (1.2) is of species one if and only 
if the auxiliary system (5.1) has a solution which is not an integral of (1.2). 


Let $f^{r+1}$ be a solution of (5.1) which does not satisfy (2.1). The system
(5.2) $\omega^1 = 0, \dots, \omega^r = 0$, $df^{r+1} = 0$


is passive. Its integrals are a set of reduced variables for (1.2).
It was seen in §4 that if $f^1, \dots, f^{r+1}$ constitute a set of reduced variables




any one of them which is not an integral of the system can be made to play
the role of $x^{r+1}$ in the discussion of reduced form. Adjoining $dx^{r+1} = 0$ to a
reduced form of the equations obviously gives a passive system of $r+1$
equations. Hence any reduced variable which is not an integral can play the
role of $f^{r+1}$ in (5.2) and will therefore satisfy (5.1). Since the integrals
obviously satisfy (5.1), all the reduced variables satisfy (5.1).

If (5.1) has only $r+1$ independent solutions, any $r+1$ independent
solutions will be a set of reduced variables because any set of reduced variables
will be functions of those $r+1$ solutions, and the differentials of the reduced
variables will be linear homogeneous combinations of the differentials of the
$r+1$ solutions, so that the system can be expressed in terms of the latter set
of differentials. Hence we have


THEOREM 3. For a system of species one the auxiliary system (5.1) has at
least $r+1$ independent solutions. If it has exactly $r+1$, any set of $r+1$
independent solutions is a set of reduced variables.


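An example is
(5.3) $dx^1 - x^3\,dx^5 = 0$, $dx^2 - x^1\,dx^3 - x^4\,dx^5 = 0$.


The auxiliary system is


\[ \frac{\partial f}{\partial x^4} = 0, \qquad x^1 \frac{\partial f}{\partial x^2} + \frac{\partial f}{\partial x^3} = 0. \]

It has just three solutions: $x^1$, $x^5$, $x^2 - x^1 x^3$. By the use of them, system (5.3)
can be written in the reduced form


\[ dx^1 - x^3\,dx^5 = 0, \qquad d(x^2 - x^1 x^3) - \bigl(x^4 - (x^3)^2\bigr)\,dx^5 = 0. \]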
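
The reduced form can be checked directly. The sketch below assumes that (5.3) reads $dx^1 - x^3\,dx^5 = 0$, $dx^2 - x^1\,dx^3 - x^4\,dx^5 = 0$; this reading is an assumption, chosen because it agrees with the three solutions $x^1$, $x^5$, $x^2 - x^1 x^3$ quoted for the auxiliary system. The names $\omega^1$, $\omega^2$ denote the two left members.

```latex
% Assumed reading of (5.3):
%   \omega^1 = dx^1 - x^3 dx^5,   \omega^2 = dx^2 - x^1 dx^3 - x^4 dx^5.
\begin{align*}
d(x^2 - x^1 x^3)
  &= dx^2 - x^1\,dx^3 - x^3\,dx^1 \\
  &= (\omega^2 + x^4\,dx^5) - x^3\,(\omega^1 + x^3\,dx^5) \\
  &= \omega^2 - x^3\,\omega^1 + \bigl(x^4 - (x^3)^2\bigr)\,dx^5 .
\end{align*}
```

On this reading the second reduced equation is the combination $\omega^2 - x^3\omega^1 = 0$, and only the three differentials $dx^1$, $dx^5$, $d(x^2 - x^1x^3)$ appear.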


The auxiliary system can, however, have more than $r+1$ solutions. An
example is furnished by the system


(5.4) $dx^1 - x^2\,dx^3 = 0$,


whose auxiliary system has the $r+2$ solutions $x^1$, $x^2$, $x^3$.

Since the forms (3.1) are of degree r+2, except when they are zero, they 
cannot have more than r+2 linear factors. Consequently, the number of inde- 
pendent solutions of the auxiliary system never exceeds r+2 unless (1.2) is 
passive. When this maximum number of solutions is attained is stated by 


THEOREM 4. The auxiliary system for a system of species one has r+2 
independent solutions if and only if the class is r+2. 


If the class is $r+2$, let $x^1, \dots, x^{r+2}$ be characteristic variables. When the
forms (3.1) are expressed in terms of these x’s, they become

(5.5) $\Omega^{\alpha} = B^{\alpha}\,dx^1 \cdots dx^{r+2}$ $(\alpha = 1, 2, \dots, r)$,


because they are of degree $r+2$ in $r+2$ differentials. Hence the auxiliary
system (5.1) has the $r+2$ x’s for solutions.

Conversely, let $x^1, \dots, x^{r+2}$ be independent solutions of (5.1).
Equations (5.5) then hold. From them we find the characteristic system* of (1.2)
to be


\[ dx^1 = 0, \dots, dx^{r+2} = 0. \]


Hence the class is $r+2$, and the theorem is proved.

When the auxiliary system has $r+2$ solutions, $r+1$ of them chosen at
random do not necessarily form a set of reduced variables. Thus $x^1$ and $x^2$ do
not form such a set for (5.4). When one solution $f^{r+1}$, other than an integral,
has been found for the auxiliary system, the set of reduced variables is
completed by finding the integrals of (5.2). This can be accomplished by the
method developed in §2; for example, by solving


(5.6) $\omega^1 \cdots \omega^r\, df^{r+1}\, df = 0$.
For the example (5.4) with $f^{r+1} = x^2$, system (5.6) is

\[ x^2 \frac{\partial f}{\partial x^1} + \frac{\partial f}{\partial x^3} = 0, \]

of which a solution is $x^1 - x^2 x^3$. Hence a set of reduced variables containing
the variable $x^2$ is $x^2$, $x^1 - x^2 x^3$, and the equation can be written


\[ d(x^1 - x^2 x^3) + x^3\,dx^2 = 0. \]
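
A one-line expansion confirms the rewritten equation. The sketch assumes that (5.4) is the equation $dx^1 - x^2\,dx^3 = 0$ and that the completed set of reduced variables is $x^2$, $x^1 - x^2x^3$, which is how the damaged formulas are read here.

```latex
% Assumed reading of (5.4): dx^1 - x^2 dx^3 = 0, reduced variables x^2 and x^1 - x^2 x^3.
\begin{align*}
d(x^1 - x^2 x^3) + x^3\,dx^2
  &= dx^1 - x^2\,dx^3 - x^3\,dx^2 + x^3\,dx^2 \\
  &= dx^1 - x^2\,dx^3 .
\end{align*}
% Check of the first-order equation x^2 f_1 + f_3 = 0 for f = x^1 - x^2 x^3:
%   f_1 = 1,  f_3 = -x^2,  so  x^2 \cdot 1 + (-x^2) = 0.
```

The same equation is therefore expressed with the two differentials $dx^2$ and $d(x^1 - x^2x^3)$, that is, with $r + 1 = 2$ differentials.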


To put a given system in reduced form, write it first in the form (2.7),
and then consider the auxiliary system for (4.2). When one solution of the
latter has been found, the set of reduced variables can be completed as
indicated above for (5.6).

6. Properties of the reduced form for systems of species one. For a
system in the form (4.3) the derived forms $\omega'$, with $dx^1, \dots, dx^r$ eliminated by
means of (4.3), are


(6.1) $\omega'^{\alpha} = -dA^{\alpha}\,dx^{r+1} = -\delta A^{\alpha}\,dx^{r+1}$,
where $\delta$ denotes the differential formed on the assumption


(6.2) $x^1 = \mathrm{const.}, \dots, x^{r+1} = \mathrm{const.}$


Now one of the $\delta A$’s must be different from zero. Otherwise all the derived


* Cartan, loc. cit. in §3 of this paper. 


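forms would be zero and the system would be passive. Hence the
characteristic system contains the equations

\[ dx^1 = 0, \dots, dx^{r+1} = 0, \quad \delta A^{q+1} = 0, \dots, \delta A^{r} = 0. \]


Its rank, the class p of the system, is accordingly the number of independent
variables in the set


(6.3) $x^1, \dots, x^{r+1}, A^{q+1}, \dots, A^{r}$.


The rank of the matrix


(6.4) $\begin{pmatrix} \partial A^{q+1}/\partial x^{r+2} & \cdots & \partial A^{q+1}/\partial x^{n} \\ \vdots & & \vdots \\ \partial A^{r}/\partial x^{r+2} & \cdots & \partial A^{r}/\partial x^{n} \end{pmatrix}$


is therefore $p-r-1$.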
The number of linearly independent solutions of 


is the number of unknowns minus the rank of (6.4). This number, increased
by q corresponding to equations (4.4), gives the number of equations $r^1$
in the first derived system:


(6.5) $r^1 = 2r + 1 - p$.
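
The count behind (6.5) can be written out. The sketch assumes that the unknowns are the $r - q$ multipliers attached to the last $r - q$ equations of (4.3); the surviving text gives the rank of (6.4) as $p - r - 1$ but does not state the number of unknowns, so that figure is an assumption.

```latex
% Assumption: r - q unknown multipliers, one per equation q+1, ..., r of (4.3).
% Requires amsmath for \text and \underbrace labels.
\begin{align*}
r^{1} &= \underbrace{(r-q)}_{\text{unknowns}}
         - \underbrace{(p-r-1)}_{\text{rank of (6.4)}}
         + \underbrace{q}_{\text{equations (4.4)}} \\
      &= 2r + 1 - p .
\end{align*}
```

Solving for p gives $p = 2r - r^1 + 1$, the value of the class used in §9.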


7. The genus of a Pfaffian system. The genus of a Pfaffian system, as
defined by Cartan,* satisfies the equation


(7.1) $n - \gamma = r + \sum s$,


where the sum on the right extends over all the characters s. The number $\gamma$
is the maximum dimensionality of a non-singular integral variety.

In studying Pfaffian systems, however, it is customary to consider changes
of variables which do not preserve dimensionality, i.e., the number of
dimensions of the representative space may change. Hence $\gamma$ is not an invariant
with respect to the transformations considered. This leads to some confusion,
which is perhaps only increased by defining a “true” genus.

On the other hand, the non-zero characters are invariant under the trans- 


* Cartan, E., Sur l’intégration des systèmes d’équations aux différentielles totales, Annales
Scientifiques de l’École Normale Supérieure, (3), vol. 18 (1901), p. 262.
† Cf. Goursat, p. 362; Cartan, p. 291.




formations in question. Consequently, the left member of (7.1) is invariant. 
Its significance is the minimum number of independent relations defining a
non-singular integral variety. Its least value is r. We shall write


(7.2) $g = n - \gamma - r$.


Because of its invariance, we shall employ g rather than $\gamma$, and in order not
to multiply names needlessly, we shall call g the genus rather than $\gamma$. The
least value of g is zero and occurs for a passive system. For a non-passive
system, the value of $\gamma$ computed in characteristic variables is at least one.
Hence, the genus of a non-passive system satisfies the inequalities


(7.3) $0 < g \le p - r - 1$.


An integral variety defined by fewer than $g + r$ relations is necessarily
singular.

8. The genus of systems of species one. The system* whose rank is the 
character s =s, of system (4.3) is readily found by use of (6.1) to be 


OA? oA? 
( dxtt? +--+ +4 
Ox? 


aAi 
— dx! bart? + + bx? = (i = + r), 


Ox? 


where the variables are characteristic (i.e., $n = p$), the dx’s are given, and the
$\delta x$’s are the unknowns.

If $dx^{r+1} \ne 0$, multiplying the second, third, ..., $(p-r)$th columns of the
matrix of (8.1) by $dx^{r+2}/dx^{r+1}, \dots, dx^{p}/dx^{r+1}$, respectively, and
adding to the first reduces it to a column of zeros. Hence the rank of (8.1)
is that of (6.4), namely, $p-r-1$. Thus the value of the character is


(8.2) $s = p - r - 1$,


a formula which of course only applies to non-passive systems. Since the
genus is at least equal to s, inequality (7.3) gives


(8.3) $s = g$,

a result which we state as
THEOREM 5. For any system of species one the genus and character are equal.
Because of (8.3) equation (6.5) can be written


(8.4) $s = r - r^1$,


* Goursat, p. 290. 




9. Systems of species one whose first derived system has species not
exceeding one. If the first $r^1$ equations of a system S written in reduced form
(4.1) constitute its derived system $S^1$, the auxiliary systems (5.1) for S and
$S^1$ have in common the solution $x^{r+1}$, which is not an integral of S or $S^1$.

Conversely, if the auxiliary systems of S and $S^1$ have in common a
solution $x^{r+1}$ which is not an integral, let a set of reduced variables containing
$x^{r+1}$ be found for $S^1$. By means of it $S^1$ is put in the form


(9.1) $S^1$: $dx^1 - A^1\,dx^{r+1} = 0, \dots, dx^{r^1} - A^{r^1}\,dx^{r+1} = 0$.


Clearly $x^1, \dots, x^{r^1}$ are integrals of the passive system obtained by adjoining
$dx^{r+1} = 0$ to S. Since any set of $r+1$ independent integrals of that passive
system is a set of reduced variables for S, a set of reduced variables for S
containing $x^1, \dots, x^{r^1}, x^{r+1}$ can be found, whereby the equations $\Sigma$ which
it is necessary to adjoin to $S^1$ in order to get S can be given the form

(9.2) $dx^{r^1+1} - A^{r^1+1}\,dx^{r+1} = 0, \dots, dx^{r} - A^{r}\,dx^{r+1} = 0$.

The conditions that S' be the derived system of S can be written 
(9.3) = 0 (a = 1,2,---,r'). 


Substitution from (9.1) and (9.2) gives 


(9.4) = 0 (a = 1,2,---,r%), 
Consequently $A^1, \dots, A^{r^1}$ are functions of $x^1, \dots, x^{r+1}$ alone:
(9.5) $A^1 = A^1(x^1, \dots, x^{r+1}), \dots, A^{r^1} = A^{r^1}(x^1, \dots, x^{r+1})$.


Because of the statement just preceding (6.3), relations (9.5) show that
the number of independent variables in the set

(9.6) $x^1, \dots, x^{r+1}, A^{r^1+1}, \dots, A^{r}$

is the class p. From (6.5) the class is $2r - r^1 + 1$. Since this is the number of
variables (9.6), those variables form an independent set.

If $S^1$ is passive, any one of its reduced variables is an integral of S and
therefore satisfies the auxiliary system for S.

If $S^1$ is not passive and its class is not $r^1+2$, by Theorems 3 and 4 any
reduced variable for $S^1$ is a function of the $x^1, \dots, x^{r^1}, x^{r+1}$ in (9.1) and
therefore satisfies the auxiliary system of S.

If $S^1$ is of class $r^1+2$, one of the coefficients in (9.1) must involve one of
the variables $x^{r^1+1}, \dots, x^{r}$. Suppose one involves $x^r$. A change of variable
will make the coefficient in question $x^r$. At the same time the form of the last
equation of $\Sigma$ can be preserved by using the equations of S to eliminate the



366 J. M. THOMAS [April 


excessive differentials. Direct calculation then shows that $x^{r}$ is a characteristic
variable and consequently a reduced variable for S'; and from the
form of (9.1), (9.2) $x^{r}$ is reduced for S.

In every case, therefore, if a variable which is not an integral is reduced for 
both S' and S, every variable reduced for S' is reduced for S also. 

As a result of the developments in this section we have 


THEOREM 6. Given a system S of species one whose first derived system S'
has species not exceeding one. Any reduced form of S' can be made the first r'
equations of a reduced form of S if and only if the auxiliary systems of S and S'
have in common a solution which is not an integral.


Simultaneous reduction is not always possible. The system 
dx! — x*dx‘ = 0, 


dx? — x5dx* = 0, dx? — x®dx* — xtdx' = 0 


(9.7) 


is of species one. Its derived system is the first equation and is also of species 
one. But S and S' cannot be thrown simultaneously into reduced form be- 
cause their auxiliary systems have no solution in common. 

10. Nested systems of species one. Canonical form. If the auxiliary sys- 
tems (5.1) formed for a system and all its derived systems, except the last, 
have in common a solution which is not an integral, the system will be called 
a nested system of species one. All the derived systems, except the last, are 
systems of the same sort. 

By a canonical form of a Pfaffian system we mean an equivalent system in 
which the variables are all independent and each equation is in the canonical 
form for a single equation.* 

We next prove 


THEOREM 7. Every nested system of species one can be reduced to a canonical 
form having the following properties: 

(i) The system S is in reduced form. 

(ii) The first derived system S' is obtained by omitting the last $r - r'$ equations
of S, the second derived system S'' by omitting the last $r' - r''$ equations of S', etc.

(iii) Every variable occurs at most once as a coefficient and at most once as a
differential with the exception of the variable whose differential occurs in all the
second terms.

(iv) Any variable which occurs as a coefficient but not as a differential in
$S^i$ occurs as a differential in $S^{i-1}$.

(v) The only variables which do not occur as coefficients are those whose
differentials occur in the last derived system and the variable whose differential
occurs in all the second terms.

(vi) The only variables whose differentials do not occur at all are the coefficients
in the equations which are in S but not in S'.


* See Goursat, p. 58, for a definition of the latter.


It is of course clear that every derived system will be in a canonical form 
having the same properties. 

The theorem is immediate for r = 1. 

We now assume the theorem for r — 1 equations and proceed by induction. 

The given system S is not passive, but since we are to apply induction 
and one of the derived systems may be passive, it is necessary to consider the 
passive case and to remark that a passive system can obviously be thrown 
into a canonical form having all the enumerated properties. 

Since S is not passive, the number of equations in its derived system does 
not exceed r—1. The first derived system can therefore be supposed written 
in canonical form as the first r’ equations of S: 


dx! = 0,---,dx* = 0, — xtdxt! =0,---, 


10.1 
( dx” — = 0. 


Theorem 6 can be applied to show that the remaining equations of S can 
be simultaneously given the form 


(10.2) $dx^{r'+1} - A^{r'+1}\,dx^{r+1} = 0, \dots, dx^{r} - A^{r}\,dx^{r+1} = 0.$


Let the variables which occur as coefficients but not as differentials in S'
be denoted by


Relations (9.5) give 

(10.4) = Fr'+1(1, grits! = Fr'+s'(x1, 

It must be possible to solve (10.4) for $s'$ of the variables

(10.5) 

for otherwise their elimination would lead to a relation among 
x", grits’ arth, 


that is, among the variables in terms of which S' is expressed. This would
contradict the fact that S' is in canonical form. In particular, we must have


(10.6) $s' \le r - r'.$




Hence by means of (10.4) we can express the differentials of s' of the 
variables (10.5), which it is merely a matter of notation to suppose are 


(10.7) ar +1, 
as linear, homogeneous combinations of 
dx!,---, dx", dg tt, dart, 
By use of the equations of S' these become combinations of 


When these values have been substituted in (10.2), equations (10.2) can be 
put by solving into a similar form in which the differentials of the variables 
(10.7) are replaced by those of (10.3). So all x’s occurring as coefficients in 
S' occur as differentials in the new form of S. Moreover, the A’s in the new 
(10.2) when taken with the x’s form a set of independent variables because 
of the result about the variables (9.6). The induction is therefore complete. 

If the derived system is vacuous, the demonstration holds as given, but 
also follows immediately from the reduced form and the result about the 
variables (9.6). 

It is clear that any system written in the above canonical form is nested and 
of species one.

The number $s'$ appearing in the above discussion and in (10.6) is the number
of equations in S' but not in S'', for it gives the number of $\omega'$'s which vanish
by virtue of S but not by virtue of S'. Hence we have


(10.8) $s' = r' - r''.$


In the same way we denote by $s^i$ the number of variables which occur as
coefficients but not as differentials in $S^i$. From (10.8) we have therefore


(10.9) $s^i = r^i - r^{i+1}.$


From (8.4) it is seen that the s's are the first characters of the successive
derived systems, $s^0$ being s. From (8.4) and (10.6) we get


(10.10) $s \ge s' \ge s'' \ge \cdots.$
We find it convenient to define a set of integers t by means of the formulas
(10.11) $t^i = s^i - s^{i+1}.$


These t's are the second differences of the numbers r. Even for a nested system
of species one they do not necessarily satisfy inequalities like (10.10), but
they are non-negative because the s's satisfy (10.10).




Let l be the least number such that the derived system $S^l$ is passive or
vacuous. Since


$r^l = r^{l+1} = \cdots,$


we have
(10.12) $s^i = t^i = 0$ $(i \ge l)$.


As a result of the foregoing, the numbers s, t are all determined for a nested
system of species one when the r's are given. Conversely, if $t, t', \dots,
t^{l-1}$ and $r^l$ are given, the s's are determined by (10.11) and (10.12). The
other r's are then determined by (10.9).
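The passage between the r's, the s's and the t's described above is pure arithmetic, so it can be illustrated by a short sketch. The function names and the list representation of the numbers $r, r', \dots, r^l$ are assumptions made here for illustration; only formulas (10.9), (10.11) and (10.12) come from the text.

```python
def characters(rs):
    """From [r, r', ..., r^l] return (s, t).

    s[i] = r^i - r^(i+1) as in (10.9); t[i] = s^i - s^(i+1) as in (10.11),
    with s^l taken to be 0 in accordance with (10.12).
    """
    s = [rs[i] - rs[i + 1] for i in range(len(rs) - 1)]
    s_ext = s + [0]  # s^l = 0
    t = [s_ext[i] - s_ext[i + 1] for i in range(len(s))]
    return s, t


def rs_from_t(t, r_last):
    """Converse: from [t, t', ..., t^(l-1)] and r^l recover [r, r', ..., r^l]."""
    l = len(t)
    s = [0] * (l + 1)  # s[l] = 0 by (10.12)
    for i in range(l - 1, -1, -1):
        s[i] = t[i] + s[i + 1]  # (10.11) solved for s^i
    rs = [r_last]
    for i in range(l - 1, -1, -1):
        rs.append(rs[-1] + s[i])  # (10.9) solved for r^i
    return rs[::-1]


if __name__ == "__main__":
    s, t = characters([7, 4, 2, 1])
    print(s, t)
    print(rs_from_t(t, 1))
```

For the hypothetical numbers 7, 4, 2, 1 the first differences are 3, 2, 1 and every t equals 1; feeding those t's back together with $r^l = 1$ returns the original list.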

11. A nested system of species one as a sum of special systems. Consider a
nested system of species one written in the canonical form of the preceding
section. From it construct a set of systems in the following manner. Make
$\omega^r = 0$, which is the last equation in S, the last equation of one system. If $x^r$,
whose differential is the first term of $\omega^r$, is not a coefficient in S', the new system
will consist of the single equation $\omega^r = 0$ and will be called a system $T_0$.
If $x^r$ is the coefficient in an equation of S', we take that equation, say $\omega^\alpha = 0$,
as the next to the last in the new system. If $x^\alpha$ is not a coefficient in S'', we
close the new system and call it a system $T_1$. If $x^\alpha$ is a coefficient in S'',
we take the corresponding equation, say $\omega^\beta = 0$, as the third from the last in
the new system. And so on. If the system is closed with an equation from $S^i$,
it will be called a system $T_i$. It contains $i+1$ equations.

We next take the equation $\omega^{r-1} = 0$, if it is not in S', and form in the above
manner the system T which it determines. And so on until all the equations
in Σ = S − S' are exhausted. When this stage is reached, all the equations of
S, except those in its passive system, have been used. This is for the following
reason. Because of the nature of the canonical form, the coefficient in an
equation of $S^i - S^{i+1}$ occurs as a differential in an equation of $S^{i-1} - S^i$; the
coefficient in the latter occurs as a differential in $S^{i-2} - S^{i-1}$; and so on until
S − S' is reached.

The passive system is to be taken as a separate system. For it there is
already the notation $S^l$.

The systems T are nested, of species one and in canonical form. In addi- 
tion, each derived system is obtained from the preceding by omitting its last equa- 
tion, and the class exceeds the number of equations by two. Consequently, 
each system T is special and of the type first reduced to canonical form by 
von Weber.* A passive system is also special. Therefore we have 


* See Goursat, pp. 321-8. 




THEOREM 8. Every nested system of species one is equivalent to a set of special 
systems no two of which have an equation in common. 


Let $t^i$ be the number of special systems $T_i$. There is one and only one
equation in S − S' corresponding to each system T. Hence the total number
of equations in S − S' is the same as the total number of systems T, a fact
which is stated by the equation


$r - r' = t^0 + t^1 + \cdots + t^{l-1}.$


In the same way, there are just as many equations in S' − S'' as there are
systems T with index at least unity. Hence we get the sequence of relations


$r^i - r^{i+1} = t^i + t^{i+1} + \cdots + t^{l-1}$ $(i = 1, 2, \dots, l-1)$.


From these and (10.9) it readily follows that the t's are the numbers defined by
(10.11).

There is only one symbol common to two of the partial systems, namely,
$dx^{r+1}$, and it plays the same role throughout. It is clear, therefore, that a
nested system of species one can be broken up by the above process into a
sum of special systems in only one way and that two nested systems of
species one are equivalent if and only if their sets of special systems can be
transformed one into the other.

From the canonical form of a special system whose class is two more than 
the number of equations and which has no integrals it follows immediately 
that two such systems are equivalent if and only if they consist of the same number 
of equations.* Therefore two nested systems of species one are equivalent if 
and only if they have the same number of integrals and the same sequence of 
numbers t. From the results at the end of §10 this amounts to having the same
number of integrals and the same sequence of first characters; or to having 
the same sequence of numbers r. 


THEOREM 9. A nested system of species one is completely characterized by a 
finite set of arithmetic invariants, which can be taken as the number of equations 
in the successive derived systems. 


If $(r, r', \dots, r^l)$ is defined as the symbol of a Pfaffian system, a nested
system of species one is completely characterized by its symbol.
The following question arises: what conditions must the non-negative
integers in $(r, r', \dots, r^l)$ satisfy in order that it may be the symbol of a
nested system of species one? In the first place, to have a meaning at all,
it must not contain two equal r's because of the definition of l. In the second
place, the characters formed as the first differences of the numbers in the
symbol must be non-increasing and non-negative in accordance with (10.10);
and from the fact that no two r's are equal they must therefore be positive.
These two conditions are also sufficient. For the satisfaction of (10.10) insures
that the t's will be non-negative. Corresponding to each $t^i$ we write $t^i$
special systems of $i+1$ equations and class $i+3$, each in canonical form with
the $dx^{r+1}$ in common and all the other variables distinct. To these we adjoin
the differentials of $r^l$ other variables equated to zero. The result is a nested
system of species one having the specified symbol.


* Goursat, p. 328.


THEOREM 10. There is a nested system of species one having $(r, r', \dots, r^l)$
as symbol if and only if the first differences of the r's form a non-increasing
sequence of positive integers.
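The criterion of Theorem 10 and the decomposition of §11 can be tried on small symbols. The sketch below is only an illustration: the function names and the list encoding of the symbol are assumptions, while the test itself (first differences positive and non-increasing) and the count of special systems $T_i$ (namely $t^i$ of them, each with $i+1$ equations) are taken from the text.

```python
def is_symbol_of_nested_system(symbol):
    """Theorem 10: the first differences of [r, r', ..., r^l] must form
    a non-increasing sequence of positive integers."""
    if any(r < 0 for r in symbol):
        return False
    diffs = [symbol[i] - symbol[i + 1] for i in range(len(symbol) - 1)]
    positive = all(d > 0 for d in diffs)
    non_increasing = all(diffs[i] >= diffs[i + 1] for i in range(len(diffs) - 1))
    return positive and non_increasing


def special_system_counts(symbol):
    """Return [t^0, t^1, ..., t^(l-1)], where t^i is the number of
    special systems T_i (each of i+1 equations) in the decomposition."""
    s = [symbol[i] - symbol[i + 1] for i in range(len(symbol) - 1)] + [0]
    return [s[i] - s[i + 1] for i in range(len(s) - 1)]


if __name__ == "__main__":
    for sym in ([7, 4, 2, 1], [5, 4, 2], [3, 3]):
        print(sym, is_symbol_of_nested_system(sym))
    t = special_system_counts([7, 4, 2, 1])
    # The equations of the special systems plus the r^l integrals give back r.
    print(t, sum((i + 1) * ti for i, ti in enumerate(t)) + 1)
```

In the admissible example (7, 4, 2, 1) there is one system each of types $T_0$, $T_1$, $T_2$, contributing 1 + 2 + 3 equations, which together with the single integral account for all 7 equations.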

If the given system is special and of class r+2, the invariants have the 
following values: 


DUKE UNIVERSITY,
DURHAM, N. C.




FAMILIES OF GROUPS GENERATED BY TWO 
OPERATORS OF THE SAME ORDER* 


BY 
ABRAHAM SINKOV 


I. INTRODUCTION 


The present paper is an extension of a paper by W. E. Edington†
entitled On an infinite system of non-abelian groups of order $nm^{n-1}$, in which
it was shown that given any two numbers n and m, there exists a group of
order $nm^{n-1}$ generated by two operators of order n. Edington's proof requires
the assumptions that all the operators of the form $S_1^a S_2^{n-a}$ are commutative
and of the same order; it follows from the present treatment that the
second is a consequence of the first and may be dispensed with. The results
herein obtained exhibit a more general system of groups of order $nm^{n-k}$,
where k is an arbitrary factor of n, and obtain some properties of these groups
not considered by Edington.


II. A PROPERTY OF GROUPS G GENERATED BY TWO 
OPERATORS OF THE SAME ORDER 


To begin with, the generating operations $S_1$ and $S_2$ are assumed to be of
the same order n, so that $S_1^n = S_2^n = 1$. As yet no further restrictions are supposed.
A third relation which defines the order of a particular combination
of $S_1$ and $S_2$ will be introduced later on, in order to show how it actually arises.

Under these conditions, then, consider the totality of operators


(A) $S_1^a S_2^{n-a}$ $(a = 1, 2, \dots, n-1)$.
This set of operators

* Presented to the Society, March 25, 1932; received by the editors September 12, 1932. This 
paper was written under the guidance of Professor F. E. Johnston. 
† Annals of Mathematics, vol. 25 (1923), p. 85.




generates a sub-group H of G. The transform of the generator $S_1^a S_2^{n-a}$ by $S_1$
can be written


and is in H. When transformed by $S_2$ this same generator becomes


a result which is again in H. Consequently, H is invariant in G.

Evidently, the adjunction of either $S_1$ or $S_2$ to H generates G. It follows
that the index of H under G cannot exceed n; it will equal n if and only if H
involves no power of either $S_1$ or $S_2$. Such will be assumed to be the case. The
order of G is then equal to n times the order of H.

It is now possible to replace the operators $S_1^a S_2^{n-a}$, which generate H,
by an equivalent set in the sense that this new set generates the same group,
and possesses the added advantage that all of the new generators are of the
same order. In fact, they are the complete set of conjugates of $S_1 S_2^{n-1}$ under
$S_2$:


$S_2^{q}\,(S_1 S_2^{n-1})\,S_2^{-q} = S_2^{q} S_1 S_2^{n-q-1}$ $(q = 0, 1, 2, \dots, n-1)$.
This new set will be denoted by ξ:


$S_1 S_2^{n-1},\ S_2 S_1 S_2^{n-2},\ \dots,\ S_2^{n-1} S_1 = (S_1^{-1}S_2)^{-1}.$


Observe that this set is obtainable from the preceding one by multiplying
each operator of A in turn by the inverse of the one which precedes it, provided
that the identity is supposed to precede $A_1$ and to follow $A_{n-1}$. Conversely
A is obtainable from ξ because of the relation


$\prod_{i=1}^{a} \xi_i = S_1^a S_2^{n-a},$
where $\xi_i$ denotes the ith term in the set ξ. In particular
$\prod_{i=1}^{n} \xi_i = 1.$

The preceding results may now be summarized as follows: 




THEOREM I. The group G generated by two operators of the same order n 
contains an invariant sub-group H generated by n operators of the same order. 
The index of H under G is at most n. 


It is not amiss to remark at this point that the common order m of these 
new generators of H is quite arbitrary and may be assumed to be any integer 
whatever. 


III. DEFINING RELATIONS OF A 3-PARAMETER FAMILY 


It will now be assumed that the sub-group H is abelian and that it is
possible to select g of the operators ξ which shall form a set of independent
generators in the restricted sense. Its order will then be $m^g$.*

Suppose for a moment that n is composite and contains the factor k. Then
n = kx, and the set of operators ξ can be divided up in order into x sub-sets of
k each. Suppose further that it is possible for the k operators of one of these
sub-sets to be dependent upon the remaining generators, which form an independent
set in the restricted sense. Then, under such circumstances the most
general sub-group H which can be obtained, i.e., the one of greatest order,
will be of order $m^{n-k}$. By analogy with the condition previously obtained on
the $\xi_i$, the equation of connection will be chosen in the form


$\prod_{i=0}^{x-1} \xi_{ik+a} = 1$ $(a = 1, 2, \dots, k;\ n = kx)$.


Expressed in words this means that the product of the operations which are
in corresponding positions in each sub-set is the identity.
Since 


= SYS 
z—1 
i=0 
= 


whence
$(S_1 S_2^{k-1})^x = 1.$


In the particular case k = 1, this condition reduces to an identity.
Conversely, if $(S_1 S_2^{k-1})^x = 1$, it follows from a reversal of the preceding
manipulation that


$\prod_{i=0}^{x-1} \xi_{ik+a} = 1$ $(a = 1, 2, \dots, k)$,


* Miller, Blichfeldt and Dickson, Theory and Applications of Finite Groups, p. 90.




and hence that k of the operators ξ are expressible in terms of the remaining
ones.

It is now necessary to examine this condition a bit more closely. Suppose
it holds for some particular factor k of n. Then it can be shown that it is also
satisfied by every factor of n which is a divisor of k. For, suppose k = rt. Then
of the rt continued products


$\prod_{i=0}^{x-1} \xi_{irt+a} = 1$ $(a = 1, 2, \dots, rt)$,


select the following r:


$\prod_{i=0}^{x-1} \xi_{irt+jt+a} = 1$ $(j = 0, 1, \dots, r-1;\ a = 1, 2, \dots, t)$.


Multiplying together all of the left hand members and rearranging the terms
(which is permissible because of the commutativity of the $\xi_i$),


$\prod_{i=0}^{rx-1} \xi_{it+a} = 1$ $(a = 1, 2, \dots, t)$.


The condition therefore holds for t.

This leads to the conclusion that if for a given $S_1$ and $S_2$ two or more values
k satisfy the relation $(S_1 S_2^{k-1})^x = 1$, and if it is possible to choose one among
them such that all the others are divisors of it, then that one is the value to
be used in determining the number of independent generators of H.

Moreover, the assumption that the correct k requires the remaining
$(x-1)k$ operators to be independent prevents two numbers, where neither is
a multiple of the other, from simultaneously satisfying the required condition.
For, in that case, there arises the obvious contradiction that the order of H
is given by two different powers of m.

If the relation holds for no k > 1, then k = 1 and $g = n-1$.

The number k having been determined in this way, it follows that the
order of H is $m^{n-k}$. Since it has already been shown that the index of H under
G in the most general case is n, then the corresponding order of G is $nm^{n-k}$.

There thus results the following theorem: 


THEOREM II. Two operators $S_1$ and $S_2$ of the same order n for which the set
ξ is commutative generate a group G whose order is at most $nm^{n-k}$ where m is the
common order of the operators ξ and the number k is defined as the greatest
factor of n satisfying the relation


$(S_1 S_2^{k-1})^x = 1$ $(kx = n)$.


IV. GENERATING OPERATIONS OF G 


The maximum group thus defined exists for every number $nm^{n-k}$ and a
pair of generating operations $S_1$ and $S_2$, of a fairly simple form, can be set up
for the general case. Let


(@(m—1)n+-p+1 ** * a»). 


Then it can be shown that the operators $S_1^a S_2^{n-a}$ are all of order m and are
commutative. Moreover, if k is the greatest common divisor of p and n, it is
found that $n-k$ of the operators ξ form an independent set in the restricted
sense. Hence there results the following theorem:


THEOREM III. The two substitutions given above generate a group G of order
$nm^{n-k}$ where k is the greatest common divisor of p and n.


V. PROPERTIES OF THE GROUPS G 


The quotient group of H under G is cyclic and as a consequence every G 
is solvable. 

In order to determine the central it is first necessary to determine the sub-group
of it which is contained in H, i.e., the combinations of the $\xi_i$ which are
invariant in G.
To obtain this result requires a knowledge of how the $\xi_i$ are transformed
under $S_1$ and $S_2$. First, consider the transform of $\xi_i$ by $S_1$. Since
= SIS SP, 
Sr = (S'S2) P- 1) SP-4) 
In this last result, the two factors $(S_1^{n-1}S_2)^{-1}$ and $(S_1^{n-1}S_2)$ will be eliminated
as a result of the commutativity of the $S_1^a S_2^{n-a}$. Hence
$S_1^{-1}\xi_i S_1 = \xi_{i-1}.$
Similarly
$S_2^{-1}\xi_i S_2 = \xi_{i-1},$
and the $\xi_i$ are thus seen to be transformed in the same way by $S_1$ and $S_2$.
This property is also obtainable from the fact that $S_1$ and $S_2$ are contained in
the same co-set of H. As a consequence of it, the operators of H which are
invariant under $S_1$ are identical with those which are invariant in G.


It has already been seen that for a given k the operators ξ can be divided
up into k different sets


$\xi_a,\ \xi_{k+a},\ \dots,\ \xi_{(x-1)k+a}$ $(a = 1, 2, \dots, k)$


each containing x operators, and such that $x-1$ of the operators in each set
are independent. Moreover, no operators in any one of these sets can be
expressed in terms of any of the other sets. Hence, since $\xi_i$ and $\xi_{i-1}$ are in
different sets if k > 1, it follows that a combination of operators in any single
set can be invariant in G, only if k = 1.
Suppose then that such is the case and that

$\xi_1^{a_1}\xi_2^{a_2}\cdots\xi_{n-1}^{a_{n-1}}$ $(a_i < m)$
is invariant in G. ($\xi_n$ may be omitted from this combination since it is expressible
in terms of the remaining $n-1$ ξ's.) This operator is equal to its
transform under $S_1$ and consequently the following relation must hold:


This leads to 


As a consequence of the relation
$\prod_{i=1}^{n} \xi_i = 1$


the first quantity in parentheses is reducible at once to (£,1)~%. Hence 


a3—a,—a, 


This equality involves only $n-1$ of the ξ's, all of which are independent of
one another. Therefore both members must reduce to the identity, and as a
result we have the following series of congruences:


$a_{n-1} \equiv -a_1,$
$a_2 \equiv 2a_1,$
$a_3 \equiv a_2 + a_1,$
$\dots$
$a_{n-1} \equiv a_{n-2} + a_1$ (mod m).


These yield

$a_p \equiv p\,a_1$ (mod m)
and
(C) $n a_1 \equiv 0$ (mod m).


The invariant operator is now representable in the form


$(\xi_1 \xi_2^{2} \cdots \xi_{n-1}^{n-1})^{a_1},$


$a_1$ being a root of the congruence C. Every combination of the generators
$\xi_1$ to $\xi_{n-1}$ which is invariant in G must take this form. Conversely every such
operator is invariant in G and the above representation is necessary and sufficient.

If we were to consider a combination involving all the ξ's but $\xi_a$, the reasoning
would be identical with the above and it would be found that an invariant
operator is necessarily of the form


n—a n—a+l1 n—2 n—1, a; 


where again $a_1$ satisfies the congruence
$n a_1 \equiv 0$ (mod m).
The above operator may be written 


n—a—1 n—a n—2. a, 


and by virtue of the relation 


=1 


t=1 


becomes 
But 
— a, = (n — (mod m), 
so that 


a, 


= 


n 
= 




This shows that all the operators 


n—a_n—a+l n—1, a, 


are identical, and that one may with perfect generality consider


$(\xi_1 \xi_2^{2} \cdots \xi_{n-1}^{n-1})^{a_1}$
as being the only permissible combination. As a result, the invariant operators
of G contained in H form a cyclic sub-group whose order is determined as
soon as the possible values of $a_1 < m$ are known.
To find these values it is necessary to return to the congruence


$n a_1 \equiv 0$ (mod m).


If m is prime to n, $a_1 = 0$ is the only solution which is less than m. In such a
case the identity is the only invariant operator. If m and n have the greatest
common divisor d, the congruence reduces to


$\frac{n}{d}\,a_1 \equiv 0 \pmod{\frac{m}{d}}.$


Here $a_1$ may take on all the values $mq/d$ $(q = 1, 2, \dots, d)$ and the sub-group
is of order d. Both of these cases are combined in the general result that the
number of invariant operators in H is d, where d is the greatest common
divisor of m and n; they are all expressible in the form
$(\xi_1 \xi_2^{2} \cdots \xi_{n-1}^{n-1})^{mq/d}.$
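The count of solutions of the congruence $n a_1 \equiv 0$ (mod m) can be checked by direct enumeration. The sketch below assumes nothing beyond the congruence itself and the relation $a_p \equiv p\,a_1$ stated above; the function names are chosen here for illustration. Residues are listed in the range 0 to m−1, so the value $q = d$ of the text appears as 0.

```python
from math import gcd


def invariant_exponents(n, m):
    """All residues a_1 in range(m) with n * a_1 congruent to 0 modulo m."""
    return [a for a in range(m) if (n * a) % m == 0]


def exponent_vector(n, m, a1):
    """Exponents a_p = p * a_1 (mod m) for p = 1, ..., n-1."""
    return [(p * a1) % m for p in range(1, n)]


if __name__ == "__main__":
    n, m = 6, 4
    d = gcd(m, n)
    sols = invariant_exponents(n, m)
    print(d, sols)                       # the solutions are the multiples of m/d
    print(exponent_vector(n, m, sols[-1]))
```

With the hypothetical values n = 6 and m = 4 the greatest common divisor is 2 and the two solutions are 0 and 2, in agreement with the statement that the sub-group has order d.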
The generalization to any value of k follows along the same lines. Suppose
some combination of $\xi_1$ to $\xi_{n-k}$ to be invariant. For convenience, it is set down
in the form


where the operators of each set are kept together. If it is equal to its trans- 
form, 
“1 2 _ 2 


a, 1b, 1 2 


from which 


ki~i1, he 




Treating the first parenthesis as before and simplifying, 


The above relation involves only $n-k$ of the ξ's and $x-1$ of each set. Hence
these $n-k$ are all independent and both members reduce to the identity. As
a result a series of congruences is obtained:


Sh, 
== = he, 
2=kh+a, 
a3 = ke + 
= tay (mod m). 


These lead to the relations 


ae = 
a3 = 
(mod m), 
so that finally
$x a_1 \equiv 0$ (mod m).


From this point on, the reasoning is exactly like the preceding except that
the quantity n is now replaced by x = n/k. Hence the sub-group of the central
which is contained in H is cyclic and of order D where D is the greatest common
divisor of m and n/k. The final form of the invariant operators is


$(q = 1, 2, \dots, D)$


and these operators will henceforth be denoted by $h_i$.
The question now arises: "Are there any invariant operators in G which
are not contained in H?" It will first be shown that if such operators exist
they must be of the form $S_1^v h_i$ where $S_1^v$ is invariant in G. For, every operator
outside of H is of the form
outside of H is of the form 


where 3 is an operator in H. If such an operator is invariant in G it is in- 
variant under S;, which transforms it into 


and hence 
KH = 
Consequently 
KR = h;. 


If now the transform of S,°h; under S: is considered it follows that 
= Si’. 


Hence the only additional invariant operators which need be sought are
powers of $S_1$. If no power of $S_1$ is invariant under $S_2$, the central will be
wholly contained in H and will be identical with the cyclic sub-group
already found.

The next step then is to investigate when a relation such as 


Sr = Sy 
is satisfied. If it holds, so will 
SLPS = Sy’, 


from which modified form it is possible to deduce a first necessary condition. 
For it implies 


SPS; = = 


In terms of the é’s, this becomes 


Tle = 
t=—1 
i=n—v+1 t=_1 
and therefore 
n—v=2, 


whence 


That is, the only power of $S_1$ which can be invariant in G is $S_1^{n/2}$.
If $S_1^{n/2}$ is invariant, then so is $S_1^{n/2}S_2^{n/2}$ also. For it is in H and is transformed
into itself by $S_2$. But


$S_1^{n/2}S_2^{n/2} = \prod_{i=1}^{n/2} \xi_i$
and is invariant only when x = 2; m = D. The latter of these relations requires
m to be a divisor of n/k; the former requires that k = n/2.
These necessary conditions are also sufficient, and the resulting theorem is
THEOREM IV. The central of G is cyclic and of order D where D is the
greatest common divisor of m and n/k except in the special case


$k = \frac{n}{2};\ m = 2;$


in this latter case the order of the central is 2D.


The quotient group of H under G is cyclic and therefore abelian, hence H 
contains the commutator sub-group of G. To determine this sub-group, con- 
sider first the case k =1, and the following set of operators in H: 


+, 
Each of these operators is a commutator; for 
= 


Again, the first $n-2$ of them are at once seen to be independent since they
involve only $n-1$ of the ξ's. The group generated by them is therefore of
order $m^{n-2}$. If the operator $\xi_n\xi_{n-1}^{-1}$ or any of its powers were in this group, they
would be obtainable from the continued product of the first $n-2$ generators;
for, it has already been seen that the relation between the ξ's involved all of
them in a continued product. Now


Suppose 
= 


Then 


= 


n 
2 




from which 


(mod m) 


and 
$n\beta \equiv 0$ (mod m).


This last result is identical with the congruence obtained in the study of the
central. The smallest value of β which satisfies it is m/d where d is the greatest
common divisor of m and n. Hence $\xi_n\xi_{n-1}^{-1}$ is contained in the sub-group under
consideration if and only if d = m. Omitting that case for the time being, it
follows that the $n-1$ operators $\xi_a\xi_{a-1}^{-1}$ are independent and generate a group
K of index d under H.

It will now be shown that this group K is the commutator sub-group, by
demonstrating that every commutator is contained in it. However it is first 
necessary to obtain some preliminary results. 

Suppose f(ξ) represents any combination of the $\xi_i$, which by virtue of
commutativity may be written in the form


eee 
Denote by $f_{(-a)}(\xi)$ the combination


is in K. For this product contains pairs of factors, 


ar. 


which may be written 


Consider now a general commutator of G. Every operator of G is expressible
as some operator in H multiplied by a power of $S_1$ so that the most
general commutator is


It has already been seen that 


= 



a=-8B 
Then 




from which 
Sy ES) = 


also 


Sif, Si = 
Siti Si” = 
The above results show that the transform of any combination of ξ's by
$S_1^a$ reduces every original subscript by a. That is, it changes f(ξ) into $f_{(-a)}(\xi)$.
The quantity
A 


is therefore equal to 


According to the preliminary lemma, this is contained in K and therefore K 
is the commutator sub-group. 

Returning now to the case d = m, a group of index d under H would be
generated by $n-2$ operators, thus explaining why $\xi_n\xi_{n-1}^{-1}$ is in this case contained
in the group generated by the other $n-2$ of the operators $\xi_a\xi_{a-1}^{-1}$.

This remark now makes the result perfectly general for the case k = 1. The
commutator sub-group for every group of order $nm^{n-1}$ is of index d under H.
In particular, when d = 1, i.e., when G contains no other invariant operator
than the identity, the commutator sub-group coincides with H.

The generalization to any value of k offers no essential difficulty. The
reasoning follows along the same lines and in the final result the only change
is the replacement of d by D where D is the greatest common divisor of m
and n/k.


THEOREM V. The commutator sub-group of G is contained in H and is of
index D where D is the greatest common divisor of m and n/k. It is generated by
the $n-1$ operators


$\xi_a\xi_{a-1}^{-1}$ $(a = 2, 3, \dots, n)$.


VI. SPECIAL CASES


The following special cases seem worthy of mention.
In the simplest case, m = 1 and G is cyclic. $S_1 = S_2$ and the relation


$(S_1 S_2^{k-1})^x = 1$


reduces to an identity.


The family of groups G includes the dihedral groups as the special case 
n=2. 
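The dihedral case gives a convenient check on the numerical statements of Theorems II, IV and V. The sketch below merely evaluates those formulas; the function name and the dictionary of results are assumptions made for illustration, and the exceptional case of Theorem IV is read as k = n/2, m = 2, as printed.

```python
from math import gcd


def sinkov_invariants(n, m, k):
    """Orders predicted by Theorems II, IV and V for given n, m and k (k divides n)."""
    if n % k != 0:
        raise ValueError("k must be a factor of n")
    order_h = m ** (n - k)          # order of H
    order_g = n * order_h           # Theorem II (maximum order of G)
    big_d = gcd(m, n // k)          # D of Theorems IV and V
    central = 2 * big_d if (2 * k == n and m == 2) else big_d
    return {
        "order_G": order_g,
        "order_H": order_h,
        "central": central,
        "commutator": order_h // big_d,   # index D under H
    }


if __name__ == "__main__":
    # n = 2, k = 1: dihedral groups of order 2m.
    for m in (2, 5, 6):
        print(m, sinkov_invariants(2, m, 1))
```

For m = 5 this gives order 10 with trivial central and commutator sub-group of order 5; for m = 6 it gives order 12 with central of order 2 and commutator sub-group of order 3; for m = 2 it gives the abelian group of order 4. These agree with the familiar facts about dihedral groups.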

In the special case n = 3, k = 1, G is the group of order $3m^2$, previously
studied by Edington* in his thesis and also by Miller.*
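The smallest instance of this case, n = 3, m = 2, k = 1, can be verified by brute force on permutations of four letters. The two generators below are a choice made here for illustration and are not the substitutions of Theorem III; they are two operators of order 3 whose product $S_1S_2^{2}$ is of order 2. The sketch computes G, the sub-group H generated by the $S_1^aS_2^{n-a}$, the central, and the commutator sub-group.

```python
from itertools import product


def compose(p, q):
    """Permutation obtained by applying q first, then p."""
    return tuple(p[q[x]] for x in range(len(p)))


def inverse(p):
    inv = [0] * len(p)
    for x, y in enumerate(p):
        inv[y] = x
    return tuple(inv)


def power(p, e):
    result = tuple(range(len(p)))
    for _ in range(e):
        result = compose(result, p)
    return result


def generate(gens):
    """Finite group generated by gens (closure under right multiplication)."""
    identity = tuple(range(len(gens[0])))
    group, frontier = {identity}, [identity]
    while frontier:
        g = frontier.pop()
        for s in gens:
            h = compose(g, s)
            if h not in group:
                group.add(h)
                frontier.append(h)
    return group


n, m = 3, 2
S1 = (1, 2, 0, 3)   # the cycle 0 -> 1 -> 2 -> 0 (illustrative choice)
S2 = (3, 0, 2, 1)   # the cycle 0 -> 3 -> 1 -> 0 (illustrative choice)

G = generate([S1, S2])
H = generate([compose(power(S1, a), power(S2, n - a)) for a in range(1, n)])
central = {z for z in G if all(compose(z, g) == compose(g, z) for g in G)}
K = generate([compose(compose(inverse(a), inverse(b)), compose(a, b))
              for a, b in product(G, repeat=2)])

if __name__ == "__main__":
    print(len(G), len(H), len(central), len(K), K == H)
```

Here G has order $3m^2 = 12$, H has order $m^2 = 4$, the central reduces to the identity since m and n are relatively prime, and the commutator sub-group coincides with H, as the results of §V require when d = 1.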

In the special case n = 4, k = 2, G is of order $4m^2$ and is another of the
families obtained by Edington in his thesis. He defined the group by means
of the relations


$S_1^4 = S_2^4 = (S_1S_2)^2 = 1,$


The condition $(S_1S_2)^2 = 1$ is exactly what $(S_1S_2^{k-1})^x = 1$ reduces to on setting
x = 2. It is interesting to note in connection with this family of groups that it
is not necessary to assume the operators $S_1^a S_2^{n-a}$ commutative. In this particular
case that property follows as a consequence of the defining relations.

Finally, Edington's groups of order $nm^{n-1}$ mentioned in the introduction
are simply isomorphic with the groups obtained on setting k = 1.

* W. E. Edington, these Transactions, vol. 25 (1923), p. 193. G. A. Miller, Proceedings of the 
National Academy of Sciences, vol. 13 (1927), p. 24. 


GEORGE WASHINGTON UNIVERSITY, 
WASHINGTON, D. C.




GROUPS {S, T} WHOSE COMMUTATOR SUBGROUPS 
ARE ABELIAN* 


BY 
H. R. BRAHANA 


The groups generated by S and T satisfying the relations $S^3 = T^2 = (ST)^6 = 1$
were classified by Professor Miller.† The fact that makes these groups
particularly easy to manage is that the commutator subgroups are abelian.
It has been noted‡ that, with the exception of the tetrahedral group and the
dihedral group of order 6, the only groups generated by an operator S of
order 3 and an operator T of order 2 whose commutator subgroups are abelian
are those considered by Miller. The groups which we shall consider are
generated by S and T which satisfy the relations


where p is a prime, and have abelian commutator subgroups. The groups
generated by two operators of order two are the well known dihedral groups.
In view of this fact and of Miller's paper we may assume that p is a prime
greater than 3.
1. Using the notation we have used before§ we let $\sigma_i = TS^{-i}TS^{i}$,
$i = 1, 2, \ldots, p-1$. We note that T transforms $\sigma_i$ into its inverse and
that S transforms $\sigma_i$ into $\sigma_1^{-1}\sigma_{i+1}$. The group generated by the
$\sigma$'s is therefore invariant; it is contained in the commutator subgroup H,
and we shall show that it coincides with H.

Any operator of {S, T} may be written in one of the forms

(1.1) $\sigma$,
(1.2) $\sigma T$,
(1.3) $\sigma\cdot TS^{i}$,
(1.4) $\sigma\cdot S^{i}$,

where $\sigma$ is an operator of the group $\{\sigma_1, \sigma_2, \ldots, \sigma_{p-1}\}$. For, multiplication on
the right by S puts (1.1) and (1.2) into (1.4) and (1.3) respectively, and either
leaves the latter two in their present forms or puts one or both of them into


* Presented to the Society, August 31, 1932; received by the editors July 28, 1932. 

† Quarterly Journal, vol. 33 (1901), pp. 76-79.

‡ American Journal of Mathematics, vol. 50 (1928), p. 347. The two groups of orders 6 and 12
should be excepted in that paper too.

§ Cf. the last reference. 




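(1.2) and (1.1) respectively. Multiplication on the right by T interchanges the
first two and interchanges the last two, for
$\sigma\cdot TS^{i}\cdot T = \sigma\cdot TS^{i}TS^{-i}\cdot S^{i} = \sigma\sigma_{-i}\cdot S^{i}$
and
$\sigma\cdot S^{i}\cdot T = \sigma\cdot S^{i}TS^{-i}T\cdot TS^{i} = \sigma\sigma_{-i}^{-1}\cdot TS^{i}$.

Now let $Q = \sigma\cdot T^{a_0}S^{a}$ and $R = \sigma'\cdot T^{b_0}S^{b}$, where $\sigma$ and $\sigma'$ are two operators
of $\{\sigma_1, \sigma_2, \ldots, \sigma_{p-1}\}$, $a_0$ and $b_0$ are either or both 0 or 1, and a and b are
integers less than p, be any two operators of {S, T}. Their commutator
$Q^{-1}R^{-1}QR$ may be reduced as follows: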


$\Sigma = Q^{-1}R^{-1}QR$


Now if a) and by are both zero, the right side of (1.5) iso. 


If a9 =bo=1, (1.5) becomes 
If a9 =0, =1, it becomes 4,0". 
If a9=1, bo =0, it becomes o,71 


So in any case $\Sigma$ is an operator of $\{\sigma_1, \sigma_2, \ldots, \sigma_{p-1}\}$.

In the above we have not assumed that H was abelian nor that the order 
of S was prime. Hence, 
(1.6) In any group {S, T} the commutator subgroup is generated by commu- 
tators of T and powers of S. 


2. We now assume that H is abelian and that the order of S is a prime
number p. Since $T\sigma_iT = \sigma_i^{-1}$ it follows that T is in H only if H is of order
$2^m$ and type 1, 1, $\cdots$. Conversely, if H is abelian, of order $2^m$, and type
1, 1, $\cdots$, T is permutable with every operator of H, in which case T may or
may not be in H as we shall see in a later section. We shall suppose hereafter,
except where the contrary is explicitly stated, that T is not in H.

If S is in H the group {S, T} is generalized dihedral, being generated by 
an abelian group H and an operator T which transforms every operator of H 
into its inverse. In fact {S, T} must be the dihedral group of order 2p. These
two types of group are special and admit of special treatment. We shall then 
assume that neither S nor T is in H. 

The quotient group of {S, T} with respect to H is necessarily abelian. 
It is generated by two operators of orders p and 2 corresponding to S and T 
and therefore it must be cyclic and of order 2p. The operator $(ST)^{2p}$ written
in terms of commutators* is

$(ST)^{2p} = \sigma_{p-1}^{-1}\sigma_{p-2}\sigma_{p-3}^{-1}\sigma_{p-4}\cdots$

Since H is abelian this is identity. Hence,
(2.1) If the abelian commutator subgroup contains neither S nor T, then (ST) 
is of order 2p. 
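
As a concrete check of (2.1), the sketch below builds one small group of the kind considered and computes the orders involved. The construction is an assumption made for illustration and is not taken from the paper: H is realized as the vectors of $(\mathbb{Z}/3)^5$ whose coordinates sum to zero, T as inversion on H, and S as a cyclic shift followed by translation by a fixed element c. The helper names (`s_map`, `t_map`, `sigma`, and so on) are hypothetical.

```python
from itertools import product

p, n = 5, 3  # S has order p; H sits inside (Z/n)^p (illustrative choice)

# H: vectors in (Z/n)^p whose coordinates sum to 0 mod n.
H = [v for v in product(range(n), repeat=p) if sum(v) % n == 0]
index = {v: i for i, v in enumerate(H)}

c = (1, n - 1) + (0,) * (p - 2)  # fixed element of H used to define S


def s_map(v):
    """Cyclic shift of the coordinates followed by translation by c."""
    w = v[-1:] + v[:-1]
    return tuple((a + b) % n for a, b in zip(w, c))


def t_map(v):
    """Inversion: sends every element of H to its inverse."""
    return tuple((-a) % n for a in v)


# S and T as permutations of H (lists of image indices).
S = [index[s_map(v)] for v in H]
T = [index[t_map(v)] for v in H]


def compose(f, g):
    """Product fg read left to right: apply f first, then g."""
    return [g[f[i]] for i in range(len(f))]


def inverse(f):
    inv = [0] * len(f)
    for i, j in enumerate(f):
        inv[j] = i
    return inv


def order(f):
    identity = list(range(len(f)))
    k, g = 1, f
    while g != identity:
        g = compose(g, f)
        k += 1
    return k


def sigma(i):
    """sigma_i = T S^-i T S^i."""
    s_i = list(range(len(S)))
    for _ in range(i):
        s_i = compose(s_i, S)
    return compose(compose(compose(T, inverse(s_i)), T), s_i)


orders = (order(S), order(T), order(compose(S, T)))
sigmas_commute = all(
    compose(sigma(i), sigma(j)) == compose(sigma(j), sigma(i))
    for i in range(1, p)
    for j in range(1, p)
)
print(orders, sigmas_commute)
```

In this example the commutators act as translations of H, so they commute, and neither S nor T lies in the group they generate; the order of ST then comes out as 2p = 10, in agreement with (2.1).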

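3. We proceed to an examination of the $p-1$ $\sigma$'s which generate H. We
note first a relation which has been used before:

(3.1) $S^{-1}\sigma_iS = S^{-1}\cdot TS^{-i}TS^{i}\cdot S = S^{-1}TST\cdot TS^{-(i+1)}TS^{i+1} = \sigma_1^{-1}\sigma_{i+1}.$

In the same way we may obtain

(3.2) $S^{-k}\sigma_iS^{k} = \sigma_k^{-1}\sigma_{i+k},$

where i and k may take on all the values 0, 1, 2, $\cdots$, $p-1$, and $i+k$ is
reduced modulo p.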

If in (3.2) we take i to be 1 and allow k to take on the values 1, 2, $\cdots$,
$p-1$ we get the set of p conjugates of $\sigma_1$ under S. This set generates a group
which contains $\sigma_i$ for every i. Hence we have
(3.3) The group H is generated by the set of conjugates of $\sigma_1$ under S.
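
The relation (3.2) uses only $T^2 = 1$ and $S^p = 1$, so it can be checked mechanically in any group with such a pair of generators. The following sketch does this for an assumed example, a 5-cycle and a transposition in the symmetric group on five letters (whose commutator subgroup is not abelian, which does not matter for this identity). All function names are hypothetical.

```python
p = 5
S = [1, 2, 3, 4, 0]  # the 5-cycle i -> i+1 (mod 5)
T = [1, 0, 2, 3, 4]  # the transposition exchanging 0 and 1


def compose(f, g):
    """Product fg read left to right: apply f first, then g."""
    return [g[f[i]] for i in range(len(f))]


def inverse(f):
    inv = [0] * len(f)
    for i, j in enumerate(f):
        inv[j] = i
    return inv


def power(f, k):
    """k-th power of f for k >= 0."""
    result = list(range(len(f)))
    for _ in range(k):
        result = compose(result, f)
    return result


def product_of(*factors):
    """Left-to-right product of the given permutations."""
    result = list(range(p))
    for f in factors:
        result = compose(result, f)
    return result


S_inv = inverse(S)


def sigma(i):
    """sigma_i = T S^-i T S^i; sigma_0 is the identity."""
    return product_of(T, power(S_inv, i), T, power(S, i))


# (3.2): S^-k sigma_i S^k = sigma_k^-1 sigma_{i+k}, indices reduced mod p.
identity_3_2_holds = all(
    product_of(power(S_inv, k), sigma(i), power(S, k))
    == product_of(inverse(sigma(k)), sigma((i + k) % p))
    for i in range(p)
    for k in range(p)
)

# T transforms each sigma_i into its inverse.
t_inverts = all(
    product_of(T, sigma(i), T) == inverse(sigma(i)) for i in range(p)
)
print(identity_3_2_holds, t_inverts)
```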

The $p-1$ $\sigma$'s generate H, (1.6), and consequently H can have no more
than $p-1$ independent generators. There must then be a relation connecting
the p conjugates of $\sigma_1$ under S. There may be a relation connecting a smaller
number of these conjugates. If there is a relation connecting the first $m+1$
conjugates under S, viz. $\sigma_1, \sigma_1^{-1}\sigma_2, \ldots, \sigma_m^{-1}\sigma_{m+1}$, then $\sigma_{m+1}$ is in the
group generated by the first m $\sigma$'s. By (3.1) $\sigma_{m+2}$ can be expressed in terms of
the preceding $\sigma$'s and hence in terms of the first m $\sigma$'s. Therefore,

(3.4) If m+1 is the smallest number of successive conjugates of $\sigma_1$ under S
so that the last one may be expressed in terms of the preceding ones, then H has
m independent generators.

The p conjugates of $\sigma_1$ which by (3.3) generate H are of course of the same
order, the order of $\sigma_1$; we shall show also that the $p-1$ $\sigma$'s are of the same
order.

Let the order of $\sigma_i$ be $n_i$. If in (3.2) we let $i = p-k$ we have

(3.5) $S^{-k}\sigma_{p-k}S^{k} = \sigma_k^{-1}\sigma_p = \sigma_k^{-1}.$


This implies that $n_k = n_{p-k}$. If in (3.1) we let $i = 1$, we have
$S^{-1}\sigma_1^{n}S = \sigma_1^{-n}\sigma_2^{n}$. When $n = n_1$ this gives $\sigma_2^{n_1} = 1$, and therefore $n_2$ is a divisor of $n_1$.
Similarly, allowing i in (3.1) to take on successively the values 2, 3, $\cdots$ it
follows that $n_i$ is a divisor of $n_1$.

* Cf. the reference given in the third footnote in this paper.

If in (3.2) we let k=2, and let i take on successively the values 2, 4, 6,
$\cdots$, $p-1$, we see in the same manner as above that $n_4, n_6, \ldots, n_{p-1}$ are
divisors of $n_2$. By (3.5) we see that $n_{p-1} = n_1$ and since $n_2$ divides $n_1$ and
$n_{p-1} = n_1$ divides $n_2$, it follows that $n_1 = n_2 = n_{p-1} = n_{p-2}$.

If in (3.2) we let k=3, and allow i to take on the values 3, 6, $\cdots$, we find
that $n_6, n_9, \ldots$ are divisors of $n_3$. One of the numbers $p-1$ and $p-2$ is
divisible by 3 and therefore $n_1 = n_{p-1} = n_{p-2}$ divides $n_3$, and $n_3 = n_1$.

This induction may be completed by showing in the same manner that if
$n_i = n_1$ for $i = 1, 2, \ldots, k$, then $n_{k+1} = n_1$. Hence
(3.6) If the order of S is a prime and H is abelian, the $\sigma$'s are all of the same
order.

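4. In order to investigate the group H it is convenient to consider the
orders of the operators in the co-set HS. The order of $\sigma_iS$ may be obtained as
follows:

(4.1) $(\sigma_iS)^p = \sigma_iS\cdot\sigma_iS\cdots\sigma_iS.$

By application of (3.2) this becomes

$(\sigma_iS)^p = S^p\cdot\sigma_{p-1}^{-1}\sigma_{i+p-1}\cdots\sigma_1^{-1}\sigma_{i+1}\cdot\sigma_i.$

The $\sigma$'s with negative exponents are $\sigma_1, \sigma_2, \ldots, \sigma_{p-1}$. Those with positive
exponents are p in number including $\sigma_{i+p-i} = \sigma_p = 1$. The remaining $p-1$ are
$\sigma_1, \sigma_2, \ldots, \sigma_{p-1}$. Hence we have

(4.2) $(\sigma_iS)^p = 1.$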
If we take the pth power of any operator of HS we have 


This is obtained by moving the $\sigma$'s past successive S's by means of (3.1) in the
form $\sigma_iS = S\cdot\sigma_1^{-1}\sigma_{i+1}$. By repeated application of (4.3) we reduce the pth
power of any element of HS to (4.2) and hence obtain
(4.4) Every operator in the co-set HS is of order p. 

From this theorem we may draw many important conclusions concerning 
H and S. We note first 


(4.5) Any operator of H which is permutable with S is of order p.




From (4.5) and the fact that T transforms every operator of H into its 
inverse follows 
(4.6) The central of {S, T} is identity. 

If H contains operators of order p then $\{\sigma_i\}$ contains operators of order p.
If an operator of order p in $\{\sigma_i\}$ is not invariant under S the cyclic group
generated by it is not invariant and contains no invariant operator except
identity. Its conjugates generate a group of order $p^m$, $m \le p-1$, and type 1, 1,
$\cdots$. This group contains $1 + p + \cdots + p^{m-1}$ subgroups of order p and at least
one of them is invariant under S.

5. H is the direct product of its Sylow subgroups, and each of its Sylow
subgroups is invariant under S. Let the order of H be $p^{\alpha}p_1^{\alpha_1}p_2^{\alpha_2}p_3^{\alpha_3}\cdots$,
where the p's are distinct primes. Then by (4.5) none of the Sylow subgroups
corresponding to $p_i$, $i = 1, 2, 3, \cdots$, can contain invariant operators. Hence

(5.1) If the order of H is $p^{\alpha}p_1^{\alpha_1}p_2^{\alpha_2}\cdots$, where the p's are distinct primes, each
of the numbers $p_i^{\alpha_i}$ is congruent to 1, mod p.


If the residue of $p_i$, mod p, belongs to the exponent $p-1$, then the number
$\alpha_i$ must be at least $p-1$. The Sylow subgroup $H_{p_i}$ of order $p_i^{\alpha_i}$ contains a
characteristic subgroup composed of operators of order $p_i$ contained in
subgroups generated by operators of highest order in $H_{p_i}$. When $p_i$ belongs to
the exponent $p-1$ the order of this characteristic subgroup must be $p_i^{p-1}$,
and therefore $H_{p_i}$ must contain $p-1$ independent generators of highest order.

If $a_i$ is used to denote the exponent to which $p_i$ belongs, mod p, then $a_i$ is
a divisor of $p-1$. Replacing the exponent $p-1$ in the preceding paragraph by
$a_i$ we have the result that the number of independent generators of highest
order of $H_{p_i}$ is a multiple of $a_i$. Continuing the argument we observe that
$H_{p_i}$ contains a characteristic subgroup generated by operators of order $p_i$
contained in cyclic subgroups of next to highest order in $H_{p_i}$. The order of
this subgroup must also be a multiple of $p_i^{a_i}$, and the number of independent
generators of $H_{p_i}$ of next to highest order must be a multiple of $a_i$. This
argument may be continued to give

(5.2) The number of independent generators of $H_{p_i}$ of each order for which there
is one is a multiple of $a_i$, the exponent to which $p_i$ belongs modulo p.
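
The arithmetic behind (5.1) and (5.2) is easy to tabulate. The sketch below is an illustration with hypothetical function names, not part of the paper: it computes the exponent to which a prime q belongs modulo p, and lists the exponents $\alpha$ for which $q^{\alpha} \equiv 1 \pmod p$, which are exactly the multiples of that exponent.

```python
def exponent_mod(q, p):
    """Smallest a >= 1 with q**a congruent to 1 mod p (q not divisible by p)."""
    if q % p == 0:
        raise ValueError("q must be prime to p")
    a, r = 1, q % p
    while r != 1:
        r = (r * q) % p
        a += 1
    return a


def admissible_sylow_exponents(q, p, limit):
    """Exponents alpha <= limit with q**alpha congruent to 1 mod p, as in (5.1)."""
    return [alpha for alpha in range(1, limit + 1) if pow(q, alpha, p) == 1]


a = exponent_mod(3, 5)                          # 3 belongs to the exponent 4 mod 5
alphas = admissible_sylow_exponents(3, 5, 12)   # multiples of 4 up to 12
print(a, alphas)
```

For p = 5 and q = 3, for instance, the Sylow subgroup for the prime 3 can only have order $3^4$, $3^8$, and so on, and by (5.2) its independent generators of each order come in multiples of four.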


6. The Sylow subgroup $H_{p_i}$ of H is invariant under H, S, and T, and is
therefore an invariant subgroup of {S, T}. The corresponding quotient group
of {S, T} is generated by two operators of orders p and 2, and its commutator
subgroup is abelian, being the quotient group of H with respect to $H_{p_i}$.
Moreover, any product of the $H_{p_i}$'s is invariant in {S, T} and the
corresponding quotient group is of the same type as {S, T}. Hence,

(6.1) The existence of the group {S, T} whose commutator subgroup is abelian
and of order $p^{\alpha}p_1^{\alpha_1}p_2^{\alpha_2}\cdots$ implies the existence of a group {S', T'} whose
commutator subgroup is abelian and of order $p_i^{\alpha_i}$ for each i.


From this theorem it follows that the question of the existence of a group 
{S, T} with a given abelian group H as a commutator subgroup must be con- 
cerned with the existence of groups {S, T} whose commutator subgroups are 
Sylow subgroups of H. We shall show that the existence of {S, T} follows
from the existence of each of these groups whose commutator subgroups are 
prime power groups, thus proving the converse of (6.1). 

Let {S', T'} and {S'', T''} be two groups whose abelian commutator
subgroups H' and H'' are of orders $p_1^{\alpha_1}$ and $p_2^{\alpha_2}$ respectively, where $p_1$ and $p_2$
are distinct primes, let S' and S'' be of the same prime order p, and let $\sigma_i'$
and $\sigma_i''$ be commutators of the respective pairs of generators. Now let H be
the direct product of H' and H'', and let T be an operator of order 2 which
transforms every operator of H into its inverse. The group {H, T} is
generalized dihedral and its group of isomorphisms is abstractly the same as the
holomorph of H.* The group of isomorphisms of H is the direct product of the
groups of isomorphisms of H' and H'' and hence the holomorph of H
contains operators of order p. We wish to show that there is one such operator S
which with T will generate a group having H as a commutator subgroup. Let
S be an operator which transforms H' and H'' as they are transformed by
S' and S'' respectively, and let $S^{-1}TS = T\sigma_1'\sigma_1''$. Then


where $\Sigma_m = \sigma_m'\sigma_m''$.

If k= p, we have 


Taking account of the definition of $\Sigma_i$, and of (4.1) and (4.2), we have
$S^{-p}TS^{p} = T$. Since S transforms {H, T} according to an operator of order p
we may take S to be of order p.† The group {S, T} contains $\sigma_1'\sigma_1''$, and since
the orders of $\sigma_1'$ and $\sigma_1''$ are relatively prime, contains both $\sigma_1'$ and $\sigma_1''$. The


* Miller, Blichfeldt, and Dickson, Finite Groups, p. 169. 
† American Journal of Mathematics, vol. 52 (1930), p. 919.




group generated by the conjugates of $\sigma_1'\sigma_1''$ under S contains the conjugates of
$\sigma_1'$ and $\sigma_1''$ under S' and S'' respectively and so is H, which by (1.6) is the
commutator subgroup of {S, T}. Hence we have


(6.2) The existence of two groups {S', T'} and {S'', T''} whose abelian
commutator subgroups are of orders relatively prime and with S' and S'' of the same
prime order p, implies the existence of a group {S, T} with S of order p and a
commutator subgroup which is the direct product of the commutator subgroups of
the two given groups.


7. We consider next the subgroups {S', T'} which have abelian commutator
subgroups and are generated by an operator of order p and one of order
2. Let S’ and T’ =o;"T. Then 


of = T'S'“T'S! = To, -1S 


2! 
= of TS 


of TSTS - of oF 


(7.1) 


= 
Let the order of $\sigma_1$ be m. When m is odd k may be chosen to be m and then l
may be chosen so that $1-2l$ is any number less than or equal to m. Then
(7.1) becomes $\sigma_1' = \sigma_1^{m'}$, where m' is any number; hence l may be chosen so
that m' is any divisor of m. If m is even then for any choice of k the number
$2k-2l+1$ is still odd and for proper choice of k and l will be any odd divisor
of m; or k may be chosen as any even number and then l may be chosen so
that $2k-2l+1$ is any odd number less than m. If in the latter case k is taken
to be the highest power of 2 contained in m and l chosen so that $2k-2l+1$ is
the largest odd divisor of m, then $\sigma_1'$ will be of even order. Hence,


(7.2) If the order of H is pe2»per---, then {S, T} contains subgroups 
{S’, T’} whose commutator subgroups H’ have the orders -, 
where the k’s are zeros or ones independently. 


8. In the preceding pages we have determined various conditions on H 
which are necessary in order that H be the abelian commutator subgroup of 
{S, T}. The last conditions are conditions on the order of H. We wish to in- 
quire what conditions on H may be sufficient to determine H so that a group 
{S, T} necessarily exists containing H as a commutator subgroup. 

We proceed to prove the following fundamental theorem: 


(8.1) Necessary and sufficient conditions that there exist a group {S, T}
generated by S of order p and T of order 2 and containing a given abelian group
H as a commutator subgroup are (1) that the group of isomorphisms of H contain
an operator U of order p whose powers transform an operator $s_1$ of H into a set
of generators, and (2) that the product of the operators in the set of conjugates
under U which contains $s_1$ be identity.


The conditions are necessary because of (3.3) and (4.2). Conversely, the
group of isomorphisms of H contains an operator of order 2 which transforms
every operator of H into its inverse; let T be an operator of order 2 which
performs this transformation. Let S be an operator which transforms H
according to U and which transforms T into $Ts_1$. The order of S has not yet been
determined, but if its pth power transforms T into itself it will be possible to
require further that S be of order p.* We therefore determine $S^{-p}TS^{p}$.

$S^{-2}TS^{2} = S^{-1}Ts_1S = Ts_1\cdot S^{-1}s_1S.$

If we denote the successive conjugates of $s_1$ under U as follows:

$U^{-k}s_1U^{k} = s_{k+1},$

we have

$S^{-p}TS^{p} = Ts_1s_2\cdots s_{p-1}s_p.$

If $s_1s_2\cdots s_{p-1}s_p = 1$, then $S^p$ is permutable with T and S may be taken to be
of order p, which completes the proof of the converse.

Since the two conditions above are sufficient to ensure the existence of 
{S, T} it follows that the conditions stated in the preceding theorems hold; 
in particular (4.5) holds. The question arises as to whether or not (2) can be 
replaced by (4.5). The product of the set of conjugates of $s_1$ is of course
invariant under U. If (4.5) holds and the order of H is $q^n$, where q is prime to p,
this product must be identity. Hence,


(8.2) If H is abelian and of order $q^n$, where q is prime to p, then condition (1)


of (8.1) and (4.5) are necessary and sufficient that there exist a group {S, T} 
having H as a commutator subgroup. 


* Cf. the second reference in §6.

If the order of H is $p^n$, it is obvious that condition (2) of (8.1) may not be
replaced by (4.5), for the group of order $p^p$ and type 1, 1, $\cdots$ admits an
automorphism U,

$U^{-1}s_iU = s_{i+1},$

such that $s_1$ and U satisfy the given conditions. But since H has p independent
generators it cannot be the commutator subgroup of any group {S, T}. The
groups H satisfying (1) of (8.1), (4.5), and having not more than $p-1$
independent generators have been determined in another paper* and the groups
in which we are interested will be found among them. They are of the two
following categories:

(a) H of order $p^{k_1m+k_2(m-1)}$ with $k_1$ independent generators of order $p^m$
and $k_2$ of order $p^{m-1}$, where $k_1+k_2 = p-1$ and $m \ge 1$;

(b) H of order $p^{k+2}$ and type 2, 1, 1, $\cdots$, where $k \le p-2$.


In the case of the groups of the first category it is always possible to select
U so that conditions (1) and (2) of (8.1) hold. Let us suppose that $k_1 = p-1$, so
that $k_2 = 0$ and H is of type m, m, $\cdots$. Let $s_1, s_2, \ldots, s_{p-1}$ be a set of
independent generators of H, all necessarily of order $p^m$. Then the group of
isomorphisms of H contains an operator U defined as follows:

(8.3) $U^{-1}s_iU = s_{i+1}$, $i = 1, 2, \ldots, p-2$,
      $U^{-1}s_{p-1}U = s_1^{-1}s_2^{-1}\cdots s_{p-1}^{-1}$.

It is obvious that the order of U is p, and the product of the set of conjugates
of $s_1$ under U is identity.
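
The two properties just claimed for U can be checked numerically in a small case. The sketch below writes H additively as $(\mathbb{Z}/p^m)^{p-1}$ with p = 5, m = 2, and it assumes the following reading of (8.3): U sends $s_i$ to $s_{i+1}$ for $i < p-1$ and sends $s_{p-1}$ to the inverse of the product $s_1s_2\cdots s_{p-1}$. That reading, and the names `apply_U` and `order_of_U`, are assumptions made for this illustration.

```python
p, m = 5, 2
modulus = p ** m
rank = p - 1  # number of independent generators s_1, ..., s_{p-1}


def apply_U(v):
    """Image under U of the element with exponent vector v.

    v[i] is the exponent of s_{i+1}.  Assumed action: s_i -> s_{i+1} for
    i < p-1, and s_{p-1} -> (s_1 s_2 ... s_{p-1})^-1.
    """
    last = v[-1]
    shifted = (0,) + v[:-1]
    return tuple((x - last) % modulus for x in shifted)


basis = [tuple(1 if j == i else 0 for j in range(rank)) for i in range(rank)]


def order_of_U():
    """Smallest k >= 1 such that U^k fixes every generator."""
    images, k = [apply_U(v) for v in basis], 1
    while images != basis:
        images = [apply_U(v) for v in images]
        k += 1
    return k


# The p successive conjugates of s_1 under U.
conjugates = [basis[0]]
for _ in range(p - 1):
    conjugates.append(apply_U(conjugates[-1]))

# Their product, written additively as a coordinate-wise sum.
product_of_conjugates = tuple(sum(col) % modulus for col in zip(*conjugates))
print(order_of_U(), product_of_conjugates)
```

The first $p-1$ conjugates are the generators themselves, so condition (1) of (8.1) holds, and the vanishing sum is condition (2).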

The existence of a group {S, T} for the H just considered follows from
(8.1). H contains a subgroup H' of order p which is invariant in {S, T}. The
quotient group of {S, T} with respect to H' is, by an argument similar to
that used to establish (6.1), of the kind we are considering and its
commutator subgroup is the quotient group of H with respect to H'; it is of type
m, m, $\cdots$, m, $m-1$. By taking successive quotient groups with respect to
invariant subgroups of order p, we may obtain a group {S, T} having any
one of the groups of category (a) as a commutator subgroup.

The groups of category (b) do not admit of isomorphisms which with H
satisfy conditions (1) and (2) of (8.1). Let H be of type 2, 1, 1, $\cdots$, and let
$s_1$, an operator of H, be of order $p^2$. Then there exists an operator U of order
p in the group of isomorphisms of H which transforms $s_1$ into a set of
generators.† Now $U^{-1}s_1U = s_1s_x$, where $s_x$ is some operator of H. Because of the
type of H, $s_1^p$ must be invariant under U and hence $s_x$ is of order p and may
be taken to be $s_2$, one of a set of independent generators. We may then
suppose the generators of H to be chosen so that

U-15,U = sy5e, 
(8.4) = Siti, 2,3,--- 9 


a, 


* Prime-power abelian groups generated by a set of conjugates under a special automorphism, to be 
published in the American Journal of Mathematics. 
† Ibid., (5.64).




As was shown* in the paper referred to, the numbers $a_2, a_3, \ldots, a_{k+1}$ are
completely determined by k. In considering successive conjugates of $s_1$ under
U we need only consider the powers of $s_1$ in these conjugates, as will be
apparent at the conclusion. Each of the first $k+1$ conjugates contains $s_1$ to the
first power. Successive ones thereafter have $s_1$ to the powers


1 + ap, 


eo 


and so on, where $r = p-1-k$ and the numbers in the brackets are the
binomial coefficients. The last one in the series (8.5) is


The sum of these numbers is readily obtained. We note first that 


Then the coefficient of ap in the sum will be 


1 ) 2 r—2) 
which is (1—1)"-?=0. Hence the sum is simply the sum of the » numbers 
independent of ap, all of which are 1's. Therefore the product of the p
conjugates in the set which contains $s_1$, contains $s_1^p$, and cannot be identity†
since the $s_i$'s are independent generators. Since any operator U of order p
which transforms an operator $s_1$ of H into a set of generators can be written
in the form (8.4), and since the product of the set of conjugates of $s_1$ is not
identity, it follows that no group H of the second category is the commutator
subgroup of a group {S, T}. We may then state the following theorem:


(8.6) In order that there exist a group {S, T} having a given abelian group H of
order $p^n$ as a commutator subgroup it is necessary and sufficient that
(1) $n = k_1m + k_2(m-1)$, (2) $k_1 + k_2 = p-1$, and (3) H have $k_1$ and $k_2$ independent
generators of orders $p^m$ and $p^{m-1}$ respectively.


* Ibid., (5.26).
† The product is exactly $s_1^p$, but it is not necessary to prove it here.


(8.5) 


The corresponding theorem for the case where the order of H is $q^n$,
q prime to p, is obtained from (8.2) and (3.2) of the paper referred to above.

(8.7) In order that there exist a group {S, T} having a given abelian group H of
order $q^n$ as a commutator subgroup, it is necessary and sufficient that
(1) $n = a(k_1m_1 + k_2m_2 + \cdots + k_jm_j)$, where a is the exponent to which q belongs,
mod p, (2) $k_1 + k_2 + \cdots + k_j \le (p-1)/a$, and (3) H have $k_ia$ independent
generators of order $q^{m_i}$.

On the basis of a knowledge of the elementary part of the theory of abelian 


groups, theorems (6.2), (8.6), and (8.7) give a complete solution of the prob- 
lem of the determination of the groups designated in the introduction. 


UNIVERSITY OF ILLINOIS,
URBANA, ILL.


ON THE REPRESENTATION OF A POLYNOMIAL 
IN A GALOIS FIELD AS THE SUM OF AN 
EVEN NUMBER OF SQUARES* 


BY 
LEONARD CARLITZ†


1. Introduction. Let GF($p^n$) denote a fixed Galois field of order $p^n$, p
being any odd prime, and n an arbitrary positive integer; let D(x, $p^n$) denote
the totality of polynomials in an indeterminate, x, with coefficients in GF($p^n$).
In this paper we seek simple expressions for the number of representations of 
a polynomial in D as a sum of squares of polynomials in D that satisfy certain 
restrictions. 

More precisely, suppose that F is a primary polynomial, that is, the
coefficient of the highest power of x occurring in F is the 1 element of the Galois
field. Let s be a positive integer; $\alpha_1, \ldots, \alpha_s, \beta_1, \ldots, \beta_s$, 2s elements of the
Galois field such that

Yi=a+ #0 


Then 
(A) If F is of even degree, 2k, and 


9, 
we seek the number of solutions of 
(1) = + + ries + + 


in primary polynomials $X_i$, $Y_i$, each of degree k.
(B) If F is of arbitrary degree f, 2k is any even integer > f, $\alpha$ any non-zero
element of the Galois field, and


we seek the number of solutions of 
(2) $\alpha F = \alpha_1X_1^2 + \beta_1Y_1^2 + \cdots + \alpha_sX_s^2 + \beta_sY_s^2$


in primary polynomials $X_i$, $Y_i$, each of degree k.
The solution of (A) is expressed in terms of one of the functions $\rho_s(F)$,
$\omega_s(F)$, defined thus:


* Presented to the Society, August 30, 1932; received by the editors June 23, 1932. 
† This paper was written when the author was an International Research Fellow at Cambridge
University. 




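(3) $\rho_s(F) = \left(1 - \dfrac{1}{p^{ns}}\right)\sum_{M\mid F}^{m>k}|M|^{s} + \sum_{M\mid F}^{m=k}|M|^{s}, \qquad |M| = p^{nm},$

where m denotes the degree of M: the first summation is over all primary
M dividing F, and of degree > k, the second is over all primary M dividing
F and of degree = k;

(4) $\omega_s(F) = \left(1 + \dfrac{1}{p^{ns}}\right)\sum_{M\mid F}^{m>k}(-1)^{m}|M|^{s} + \sum_{M\mid F}^{m=k}(-1)^{m}|M|^{s},$

the summations having the same meaning as in (3). If now

(5) $(-1)^{s}\alpha_1\cdots\alpha_s\beta_1\cdots\beta_s$

is a square in GF($p^n$), then the number of solutions of (1) is $\rho_{s-1}(F)$; if the
expression (5) is a non-square of GF($p^n$), then the number of solutions is
$\omega_{s-1}(F)$.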

Case (B) involves a modification of the $\rho$ and $\omega$ functions: if (5) is a square,
the number of solutions of (2) is


ok (F), 


where 


1 k 
6 
(6) =) 


if (5) is a non-square, the number of solutions is 


(FP), 


where 
1 m>f—k m={—k 
(7) (— 1wk = (1 + =) 
mir 
If k>f, the second sum in (6) and (7) is vacuous and denotes zero. 

We first treat case (A); then, making use of the results for this case, it is 
easy to deduce the results (6) and (7) for case (B). The method used is quite 
elementary, and presupposes only some well known general theorems con- 
cerning Galois fields.* 

It should be emphasized that the results of this paper hold for all positive 
s. This is rather surprising when comparison is made with the known results 
concerning the number of representations of an ordinary integer as the sum 
of 2s squares: in the latter problem, while the cases 2s =2, 4, 6, 8 admit of 


* These theorems will be found in Dickson’s Linear Groups, 1901, pp. 3-54. 




simple expressions in terms of divisor functions, this is no longer true for 
2s >8. While comparison of the problem of this paper with the ordinary 
problem is of some interest, actually, since we are considering representations 
in terms of primary polynomials, the analogy is closer with the question of
the number of representations of an integer as the sum of squares of positive 
integers. 

Finally, we remark that it is possible, by methods similar to those used 
here, to determine the number of representations of a polynomial by means 
of any odd number of squares (that satisfy certain conditions). As we shall 
show in another paper, the final formulas are of quite a different type; they 
are no longer functions of divisors but involve sums of quadratic characters. 

2. Notation; preliminary lemmas.* We shall employ the following nota- 
tion. Polynomials will be denoted by large italic letters; unless the con- 
trary be explicitly stated, a polynomial will always be assumed primary. 
Ordinary integers will be denoted by small italic letters, elements of the 
Galois field by small Greek letters. The degree of a polynomial will be denoted 
by the corresponding small letter, and we shall write 


$f = \deg F, \qquad |F| = p^{nf}.$
(A, B) is the “greatest” common divisor of A and B. 
Using this notation, we have the following lemmas. 
LEMMA 1. The number of sets of polynomials [A, B] such that

deg A = a, deg B = b, (A, B) = 1,

is

$\psi(a, b) = \begin{cases} p^{n(a+b)} - p^{n(a+b-1)} & \text{for } ab \neq 0, \\ p^{n(a+b)} & \text{for } ab = 0. \end{cases}$

This lemma is a special case of a more general theorem to be proved
elsewhere. For completeness we give the following simple proof. Let us classify
the $p^{n(a+b)}$ sets of polynomials [A, B], deg A = a, deg B = b, according to their
g.c.d. Then, if $1 \le a \le b$,

(8) $p^{n(a+b)} = \sum_{M,\ m \le a}\psi(a-m,\ b-m),$

the sum being extended over all M of degree $\le a$. The right member of (8) is

$\psi(a, b) + \sum_{m=1}^{a}p^{nm}\,\psi(a-m,\ b-m) = \psi(a, b) + p^{n(a+b-1)},$

whence the lemma.
* The results of §§2, 3 hold for all p. 
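
Lemma 1 lends itself to a brute-force check over a small field. The sketch below takes the prime field GF(3), that is n = 1, represents a polynomial by its list of coefficients from the constant term upward, and counts coprime pairs of primary (monic) polynomials for all degrees up to 3. The field, the degree range, and all function names are choices made for this illustration.

```python
from itertools import product

P = 3  # the prime field GF(3), i.e. n = 1


def trim(f):
    """Copy of f with trailing zero coefficients removed."""
    f = list(f)
    while f and f[-1] == 0:
        f.pop()
    return f


def poly_mod(f, g):
    """Remainder of f on division by the nonzero polynomial g over GF(P)."""
    f, g = trim(f), trim(g)
    inv_lead = pow(g[-1], P - 2, P)  # inverse of the leading coefficient
    while len(f) >= len(g):
        coef = f[-1] * inv_lead % P
        shift = len(f) - len(g)
        for i, gi in enumerate(g):
            f[i + shift] = (f[i + shift] - coef * gi) % P
        f = trim(f)
    return f


def coprime(f, g):
    """True if the g.c.d. of f and g is a nonzero constant."""
    f, g = trim(f), trim(g)
    while g:
        f, g = g, poly_mod(f, g)
    return len(f) == 1


def monic(d):
    """All primary (monic) polynomials of degree d over GF(P)."""
    return [list(c) + [1] for c in product(range(P), repeat=d)]


def psi_bruteforce(a, b):
    return sum(coprime(A, B) for A in monic(a) for B in monic(b))


def psi_formula(a, b):
    q = P  # q = p^n with n = 1
    return q ** (a + b) if a * b == 0 else q ** (a + b) - q ** (a + b - 1)


checks = {
    (a, b): (psi_bruteforce(a, b), psi_formula(a, b))
    for a in range(4)
    for b in range(4)
}
print(all(bf == fm for bf, fm in checks.values()))
```

For a = b = 1 the count is immediate by hand: x + u and x + v are coprime exactly when u and v differ, giving 3 · 2 = 6 pairs, which is $3^2 - 3$.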




LEMMA 2. Let F, A, B be of degree f, a, b, respectively; (A, B) = 1.

(I) If a+b < f, and $\alpha$ and $\beta$ are two non-zero elements of GF($p^n$) such that
$\alpha + \beta \neq 0$, then the number of solutions of

(9) $(\alpha + \beta)F = \alpha AU + \beta BV, \qquad |AU| = |BV|,$

in polynomials U, V, is $|F/(AB)| = p^{n(f-a-b)}$.

(II) If k is an integer > f, a+b < k, and $\alpha$ is any non-zero element of GF($p^n$),
then the number of solutions of

(10) $\alpha F = AU - BV, \qquad |AU| = |BV| = p^{nk},$

is $p^{n(k-a-b)}$.

It will suffice to prove (II) alone. From (10), we have 

(11) $U \equiv A' \pmod{B}$, $a' < b$,
     $V \equiv B' \pmod{A}$, $b' < a$,

where A' and B' are not necessarily primary. Since

$u = \deg U = k - a \ge b$, and $v \ge a$,

the congruences (11) may be written in the form

$U = A' + BU', \qquad V = B' + AV', \qquad u' = v',$


where now U’ and V’ are primary. Then (10) becomes 
aF — AA’ — BB’ 
AB 


U'—V’. 


(12) 


But since (A, B) =1, there is a unique pair of polynomials A’, B’, such that 
a'<b, b’<a, and the left member of (12) is integral. If then V’ be any 
(primary) polynomial of degree v’ =k —a—b, U’ is uniquely determined, and, 
retracing the steps that led from (10) to (12), U, V are uniquely determined. 
Since V’ can be chosen in 


$p^{nv'} = p^{n(k-a-b)}$


ways, this proves case (II). The proof of case (I) is very much the same. 

3. Theorems on the $\rho$ and $\omega$ functions. We now prove certain formulas
concerning the functions $\rho_s(F)$ and $\omega_s(F)$ defined in the Introduction. As we
shall see in the next section, these formulas enable us to solve our problem 
concerning (1); furthermore, the formulas seem to be of some interest in 
themselves. 




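THEOREM 1. If F is of even degree, 2k; $\alpha$, $\beta$ two elements of GF($p^n$) such that
$\alpha\beta(\alpha+\beta) \neq 0$; s, t two (real or complex) numbers; then

(13) $\sum\rho_s(A)\,\rho_t(B) = \rho_{s+t+1}(F),$
(14) $\sum\rho_s(A)\,\omega_t(B) = \omega_{s+t+1}(F),$
(15) $\sum\omega_s(A)\,\omega_t(B) = \rho_{s+t+1}(F),$

where, in each instance, the summation is extended over all (primary)
polynomials A, B of degree 2k, such that

$(\alpha + \beta)F = \alpha A + \beta B.$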

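The three formulas (13), (14), (15) may be proved simultaneously if $\rho$
and $\omega$ be expressed in terms of the function $\Lambda_s(F, \lambda^i)$ now to be defined. We
define the "character" $\lambda(B)$ by

(16) $\lambda(B) = (-1)^{b}, \qquad b = \deg B,$

and the function $\Lambda_s(F, \lambda^i)$ by

(17) $\Lambda_s(F, \lambda^i) = \left(1 - \dfrac{(-1)^i}{p^{ns}}\right)\sum_{M\mid F}^{m>k}\lambda^i(M)\,|M|^{s} + \sum_{M\mid F}^{m=k}\lambda^i(M)\,|M|^{s}.$

It is obvious from the definitions (3) and (4) that

(18) $\rho_s(F) = \Lambda_s(F, 1), \qquad \omega_s(F) = \Lambda_s(F, \lambda),$

and therefore the several parts of Theorem 1 reduce to

(19) $\sum_{(\alpha+\beta)F = \alpha A + \beta B}\Lambda_s(A, \lambda^i)\,\Lambda_t(B, \lambda^j) = \Lambda_{s+t+1}(F, \lambda^{i+j}),$

i, j integers which may be taken = 0 or 1. We proceed to establish (19).
The left hand member of (19) is by (17)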


—1 


a=mu,b<v a=u, bev 


(20) 


where each summation is taken over all (primary) A, B, U, V satisfying
$(\alpha+\beta)F = \alpha AU + \beta BV$ as well as the conditions indicated under each $\sum$.
Call the sums $\Sigma_1, \Sigma_2, \Sigma_3, \Sigma_4$ respectively; then, since a < u is equivalent
to a < k,




(a+B8)F=aAU+8BV 
a,b<k 


mck 
= | | 
M|P 


| A |-*| 
(a+8) FP M—|—aAU+8BV 
(A ,B)=1ja,b<k—m 


a(4)a(B)| A > 1 
(A ,B)=1 (a+8)F M—=—aAU+8BV 
a,b<k—m 


|FM—| xr(4)a(B)| Al Bl 


(A,B)=1 
a,b<k—m 


by case (I) of Lemma 2. By the definition of $\psi(a, b)$ in Lemma 1, the last
expression is equal to


(22) | FM-| (— 
a,b<k—m 
Applying Lemma 1, the sum becomes 
k—m—1 “k—m—1 k—m—1 
b=1 a,b=1 
= [z —m, i], — ile p-*[k —m, i}! —m, ili. 

where, for brevity, we put 
1— (- 

1— (- 1)*p-** 
(= 1)*p-"* (- 


[k, i]; lk, iJ, §, 


i]. = 
(23) 


so that, by (21) and (22), 


m>k 
(24) M| [m — k, — je 
— p-*[m k, i]! [m — jt}. 


The treatment of >, is much the same; we have 


* For s=i=0, the symbol [k, i],=k. 


where 




a<k,beek 


m<k 
M\F 
where now 
Su | A | -*| 
(a+8)F 
(A ,B)=1;,a< k—m=b 


(A ,B)=1 
a<k—m=b 


| FM-| DY (— 1) (a, k — et 
a<k—m 


| FM-| (- 1) (h—m) [k —m, i]. — m, il! 

[k, 7], having the same meaning as in (23); therefore 

m>k 

= | M | 1) 
PF 

(25) 

-{[m k, i], — p-*[m — k, i]. }. 
Similarly 


m>k 
= | M | #t#+1(— 
M\F 


(26) 
The sum }y is slightly different in that (A, B) may be of degree k; thus 


= | F| s+t | M | y, 


M\F 


Su = | A | -*| Bl 
(a+8)F M 
(A ,B)=1; a= 


(A ,B)=1 
a=b=k—m 


= |FM-| (— — m, — m) 


and therefore we have almost immediately 


m=k 
(27) | M | 
m>k 


+ > | M| — 1) — 
M\F 


where | 


Substituting from (24), - - - , (27) into (20), we find that the left member 
of (19) is 
m=k m>k 
(28) a‘ti(M) | M | > a+i(M) | M| 
M\F M\F 
where 
xm = {1 — (— [1 — (— { [m — &, — 
+ {1 — (—1)ip-™} { [m — — pom [m — (— 
+ {1 — (= 1049-4} {lm — by — by (— 
+ (- 1) (m—b) (4 
=i— (- 
as may be verified without any calculation by applying (23) and then group- 
ing the terms in an obvious way. This evidently completes the proof of (19) 
and therefore of Theorem 1. 


We next prove a group of formulas that will be needed in §5 in deriving 
the expression sought for the number of solutions of (2). 


THEOREM 2. If F is of arbitrary degree, f; 2k is an even integer > f; $\alpha$ is any
non-zero element of GF($p^n$); s, t two (real or complex) numbers; then


where, in each instance, the summation is extended over all polynomials A, B of
degree 2k, such that $\alpha F = A - B$; $\rho_s^k(F)$, $\omega_s^k(F)$ are defined by (6) and (7),
respectively.


Exactly as in Theorem 1, the formulas (29), (30), (31) may be combined 
in a single relation involving the function A} (F, A) defined by 


(—1)\ 


\(F) being defined by (16). The equations (18) may then be replaced by 
pi (F) = AF (F, 1), wk = AFG, d), 
and the formulas (29), (30), (31) by 


(32) 4) = (1 


M\F 




(33) Aa(A, M4) = LF, AHA), 
aF=A—B 
the summation being over all A, B of degree 2k for which $\alpha F = A - B$.

The proof of (33) is very similar to that of (19), except that wherever 
Lemma 2 is necessary, we now use case (II). It is scarcely necessary to give 
the proof in detail. We begin exactly as in (20), and we shall consider only 
the first sum, >"; evidently 


aF=AU—BV 
a,b<k 


on >» | A | B| 
aF=AU—BV 
a,b<k 


M\F 


where, exactly as in (21), 


(A,B)=1 aF M'=AU—BV 
a,b<k—m a+u=b+ve2k—m 


pene | M | -1 | A | B| 
(A,B)=1 
a,b<k—m 


by case (II) of Lemma 2. Therefore 
Su = pene | M|-* SS (— 


a,b<k—m 


which may be evaluated by following the method applied to (22). Thus we 
find that 


M\F 


(34) -{[& — m, i].[& — m, — p-*[k — m, [k — m, 


m>f—k 


M\F 
[m+k—fj]i}. 
Similarly, we find that 


m>f—k 


M\F 


+k — f, i], — + k — file}; 




m>f—k 


(36) (ot ri+i(M) | M | s+t+1 
M\F 
+k — — + kk — f, fli}; 
m={—k 


Niti(F) (et t+1) { | M | 
M\F 


(37) 


m>f—k 


M| 


M\F 


Combining (34), ..., (37) exactly as in (28) (the corresponding point in the
proof of Theorem 1) we complete the proof of (33) and therefore of Theorem 2.

4. Number of solutions of (1). We begin with the case s=1 and then
proceed by induction to the formulas (3) and (4) for general s.


THEOREM 3. If $\alpha$ is an element of $GF(p^n)$, $\ne 0$ or 1; F is of even degree, 2k;
then the number of solutions of

(38) $(1 - \alpha)F = X^2 - \alpha Y^2$

in (primary) X, Y of degree k, is

(I) $\sum_{M \mid F,\ m = k} 1$ for $\alpha$ a square in $GF(p^n)$,

(II) $\sum_{M \mid F} (-1)^m$ for $\alpha$ a non-square in $GF(p^n)$.
The case (I) is almost trivial. Let $\alpha = \beta^2$, $\beta$ in $GF(p^n)$ and $\ne \pm 1$, so that
(38) becomes

$F = \dfrac{X + \beta Y}{1 + \beta} \cdot \dfrac{X - \beta Y}{1 - \beta} = UV,$

say. Evidently U and V are primary of degree k. But the number of solutions
of F=UV, U and V of equal degree, is of course the number of divisors
of F that are of degree k. Since U, V uniquely determine X, Y, this establishes
the formula (I).
(II)* $\alpha$ is now not the square of any element of $GF(p^n)$; however it is a
square in the Galois field of order $p^{2n}$, $GF(p^{2n})$, which contains the original


* This case can be deduced from the general theory of quadratic fields over, worked out in 
detail by Artin, Mathematische Zeitschrift, vol. 19 (1924), pp. 153-246. However we shall make no 
use of this theory here. 




$GF(p^n)$. Put $\alpha = \theta^2$, so that $\theta$ is in $GF(p^{2n})$ but not in $GF(p^n)$; in particular
$\theta \ne \pm 1$. Then as above

$F = \dfrac{X + \theta Y}{1 + \theta} \cdot \dfrac{X - \theta Y}{1 - \theta} = UV,$

U and V now being over $GF(p^{2n})$ and of equal degree. Put

$U = A + \theta B, \quad V = A' + \theta B',$

where A, B, A', B' are all over $GF(p^n)$; A and A' are primary and of degree
k; B and B' are of degree less than k and not necessarily primary. Then

$X + \theta Y = (1 + \theta)(A + \theta B),$

$X - \theta Y = (1 - \theta)(A' + \theta B'),$

whence A = A', B = −B'. Therefore we seek the number of solutions of

(39) $F = (A + \theta B)(A - \theta B),$

where A is primary of degree k, and B is of lesser degree and need not be primary.
This can be determined readily if we make use of two well known
properties of polynomials over a Galois field: first, an irreducible polynomial
over $GF(p^n)$ factors in $GF(p^{2n})$ if and only if its degree is even; second, a
polynomial over $GF(p^n)$ can be expressed as a product of irreducible polynomials
over $GF(p^n)$ in essentially one way.

Suppose now $F = Q^l$, Q irreducible of degree q. Clearly if l and q are both
odd, there are no factorizations (39); if q is odd but l is even, there is one such
factorization. However if q is even, there are l+1 factorizations. In other
words the number of solutions of (39) in this case is

$1 + (-1)^q + \cdots + (-1)^{lq}.$

Similarly if $F = \prod Q^l$, Q irreducible, the number of solutions of (39) is

$\prod \{1 + (-1)^q + \cdots + (-1)^{lq}\} = \sum_{M \mid F} (-1)^m.$

This completes the proof of formula (II).
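As an informal check on Theorem 3 (not part of the original argument), the following sketch counts the solutions of (38) by brute force over the prime field GF(p), that is for n = 1, and compares the count with the divisor sums (I) and (II). The function names and the representation of a polynomial as a tuple of coefficients, lowest degree first, are assumptions introduced here for illustration only.

```python
from collections import Counter
from itertools import product


def polymul(a, b, p):
    """Product of two polynomials over GF(p); coefficients are low degree first."""
    out = [0] * (len(a) + len(b) - 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            out[i + j] = (out[i + j] + x * y) % p
    return tuple(out)


def monic(deg, p):
    """All primary (monic) polynomials of the given degree over GF(p)."""
    for lower in product(range(p), repeat=deg):
        yield tuple(lower) + (1,)


def divisor_counts(F, p):
    """d[m] = number of monic divisors M of F with deg M = m (brute force)."""
    n = len(F) - 1
    d = Counter()
    for m in range(n + 1):
        cofactors = list(monic(n - m, p))
        for M in monic(m, p):
            if any(polymul(M, Q, p) == F for Q in cofactors):
                d[m] += 1
    return d


def solution_counts(alpha, k, p):
    """For each F, count monic X, Y of degree k with (1 - alpha) F = X^2 - alpha Y^2."""
    inv = pow((1 - alpha) % p, -1, p)  # modular inverse; requires Python 3.8+
    counts = Counter()
    for X in monic(k, p):
        X2 = polymul(X, X, p)
        for Y in monic(k, p):
            Y2 = polymul(Y, Y, p)
            F = tuple(inv * (x - alpha * y) % p for x, y in zip(X2, Y2))
            counts[F] += 1
    return counts


def check_theorem3(p, alpha, k):
    """True if the brute-force count agrees with (I) or (II) for every monic F of degree 2k."""
    is_square = any(b * b % p == alpha for b in range(1, p))
    counts = solution_counts(alpha, k, p)
    for F in monic(2 * k, p):
        d = divisor_counts(F, p)
        if is_square:
            expected = d[k]                                      # formula (I)
        else:
            expected = sum((-1) ** m * c for m, c in d.items())  # formula (II)
        if counts[F] != expected:
            return False
    return True
```

For instance, `check_theorem3(5, 4, 1)` exercises case (I), since 4 is a square modulo 5, while `check_theorem3(3, 2, 1)` exercises case (II). The search is exhaustive and only practical for very small p and k.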
We are now able to prove our first principal result. 


THEOREM 4. If $\alpha_1, \ldots, \alpha_s, \beta_1, \ldots, \beta_s$ are non-zero elements of $GF(p^n)$,
such that
+0, 


F is of even degree, 2k; then the number of solutions of (1) is $\rho_{s-1}(F)$ if
(40) $(-1)^s \alpha_1 \cdots \beta_s$
is a square in $GF(p^n)$; and is $\omega_{s-1}(F)$ if (40) is a non-square in $GF(p^n)$.




The case s=1 of this theorem is clearly true by virtue of Theorem 3 and
the definition of $\rho_0(F)$ and $\omega_0(F)$. Assume the theorem true for all values up
to and including s. In order to effect the induction it is necessary to consider
two cases: (I) for some j <s+1, 


e+1 


(41) yD = 755 
i=1 


(II) for no 7 is (41) satisfied. 
(I) Assume the notation is such that 7‘) =yi+ --- +y7,+0. By hy- 
pothesis ~0. If we put 
VDP) = yA + B, 
(42) 
|A|=|B| =|F|, 


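then, since our theorem is assumed true for s, it is obvious that the number
of solutions in question for s+1 is

(43) (i) $\sum \rho_{s-1}(A)\rho_0(B)$, (ii) $\sum \rho_{s-1}(A)\omega_0(B)$,
(iii) $\sum \omega_{s-1}(A)\rho_0(B)$, (iv) $\sum \omega_{s-1}(A)\omega_0(B)$,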

according as 
(i) (5) and $-\alpha_{s+1}\beta_{s+1}$ are both squares,

(ii) (5) is a square, $-\alpha_{s+1}\beta_{s+1}$ a non-square,

(iii) (5) a non-square, $-\alpha_{s+1}\beta_{s+1}$ a square,

(iv) (5) and $-\alpha_{s+1}\beta_{s+1}$ both non-squares;
the sums (43) being taken over all A, B satisfying (42). If now we apply 
Theorem 1, it is clear that the induction is complete for case (I). 

(II) Since (41) is satisfied for no j, it is clear that $\gamma_1 = \gamma_2 = \cdots = \gamma_{s+1}$,
and therefore s is a multiple of p. As a consequence of this,


Let us now put, in place of (42), 
(44) = + + Yo41)B*, | A| =| Bl =| FI; 
in place of (43) we now have 
etc., 
summed over all A, B satisfying (42). The induction is completed as in case 


(I). 


* Or F=—A-+2B. 




COROLLARY. Let F be of degree 2k over $GF(p^n)$, s not a multiple of p. Then
the number of solutions of

$2sF = X_1^2 + X_2^2 + \cdots + X_{2s}^2$

in (primary) $X_i$ of degree k is $\rho_{s-1}(F)$ if
(i) s is even,

(ii) s is odd, n is even,

(iii) s and n are odd, $p \equiv 1 \pmod 4$;
the number of solutions is $\omega_{s-1}(F)$ otherwise, that is, if

(iv) s and n are odd, $p \equiv 3 \pmod 4$.

5. Number of solutions of (2). Our second principal result is contained 
in the following theorem. 


THEOREM 5. If $\alpha, \alpha_1, \ldots, \alpha_s, \beta_1, \ldots, \beta_s$ are non-zero elements of $GF(p^n)$,
such that


+7 = 9; 


F is of arbitrary degree, f; 2k is an even integer >f; then the number of solutions 


of (2) is 


according as (40) is or is not a square in GF(p"). 
Take first s=1; we may write (2) in the form
(45) $\alpha F = X^2 - Y^2$, deg X = deg Y = k
(so that (40) is necessarily a square). But (45) is equivalent to
F = UV, deg U = k, deg V = f − k;*

therefore the number of solutions of (45) is the number of divisors of F of
degree f − k, i.e.

m=f—k 

1 = (F). 
M\F 


Since (40) is necessarily a square, our theorem holds for s =1. 
For s>1, we make use of Theorem 4. Since $\gamma = 0$, $\gamma_1 \ne 0$, plainly $\gamma - \gamma_1 \ne 0$.
Let us put


(46) $\alpha F = \gamma_1 A + (\gamma - \gamma_1)B$, deg A = deg B = 2k.


* U and V are of course primary. 




By Theorem 4 we may express the number of solutions of 
= + (y — = eX? +---+6.¥?, 
in terms of po(A), wo(A); ps-2(A), ws-2(A), respectively. Thus, if —a,8; and 
(—1)*-'a, - - - 8, are both squares, the number of solutions of (2) is 
)ps-2(B) 
summed over all A, B satisfying (46). Applying Theorem 2, this sum is 
which proves the theorem in this case. The proof is exactly the same in each 


of the remaining three cases and need not be repeated. This completes the 
proof of Theorem 5. 


COROLLARY 1. If all the hypotheses of Theorem 5 are true, and in addition
k>f, then the number of solutions of (2) is


(1 > | M | s—1 


M\F 


1)-™| M | s-1 


according as (40) is or is not a square in GF(p"), the summations now being 
taken over all M dividing F. 


COROLLARY 2.* Let F be of degree f over $GF(p^n)$, 2k an even integer >f, s a
multiple of p, $\alpha \ne 0$. Then the number of solutions of


$\alpha F = X_1^2 + \cdots + X_{2s}^2$
in (primary) $X_i$ of degree k is
if ns(p—1)/2 is even; the number of solutions is 
if ns(p—1)/2 is odd. 
* Cf. the corollary to Theorem 4.


DUKE UNIVERSITY,
DURHAM, N.C.


ON THE CLASS NUMBER OF A CYCLIC FIELD* 


BY 
CLAIBORNE G. LATIMER 


1. Introduction. Let Q be the field defined by a primitive mth root of 
unity, m an integer >2, and let F be a subfield of Q. In a recent article,†
Gut showed that if F is real, the class number may be written $h = \delta/R$, where
R is the regulator of F and $\delta$ is a product involving certain group characters.
If F is imaginary, he showed that $h = h_1 \cdot h_2$, where $h_1$ is a closed expression
and $h_2 = \delta/R$, $\delta$ and R being as before. If F = Q and m is an odd prime, Gut’s
$h_1$ and $h_2$ are the same, except perhaps for sign, as Kummer’s well known first
and second factors of the class number.

We shall assume hereafter that the Galois group of F is cyclic. In this
case, as noted by Gut, the $\delta$ in his expression for h, or $h_2$, may be written as a
determinant. Employing this determinantal form, we shall show that $\delta/R$,
and hence h or $h_2$, is equal to $N(\tau)/N(\mathfrak{K})$, where $N(\mathfrak{K})$ is the norm of a
non-singular ideal $\mathfrak{K}$, in a set $\mathfrak{G}$ of elements in a certain commutative algebra, and
$N(\tau)$ is the norm of a principal ideal $\{\tau\}$ in $\mathfrak{G}$, $\tau$ being an element in $\mathfrak{K}$.‡

In certain cases our results may be expressed in terms of an ideal in a 
cyclotomic field. (See Theorem 2.) For the case where F is a cubic field, the 
discriminant of which is the square of a prime, Theorem 2 is equivalent to 
Eisenstein’s result that the number of classes of certain “associated (cubic)
forms” is $h = \mu^2 - \mu\nu + \nu^2$, where $\mu$, $\nu$ are rational integers.§

2. The ratio of two determinants. Let F be of degree E and let s be a
generating substitution of the Galois group. If $\theta$ is a number of F, not rational, it will be
understood that $\theta^{(i)} = s^i(\theta)$ $(i = 1, 2, \ldots, E)$, $\theta^{(0)} = \theta^{(E)} = \theta$. Let e=E or
e=E/2 according as F is real or imaginary. Then $\theta^{(i+e)}$ is the conjugate
imaginary of $\theta^{(i)}$ $(i = 0, 1, 2, \ldots, e-1)$.

Let $\eta_1, \eta_2, \ldots, \eta_n$ be a fundamental set of units of F. By Dirichlet’s well
known theorem, n = e − 1. Since every $\eta'$ belongs to F,


* Presented to the Society, December 28, 1931; received by the editors August 27, 1932.

† Die Zetafunktion, die Klassenzahl und die Kronecker’sche Grenzformel eines beliebigen Kreiskörpers,
Commentarii Mathematici Helvetici, vol. 1 (1929), p. 160.

‡ It will be understood that we use the same definitions of terms referring to ideals in $\mathfrak{G}$ as are
given by MacDuffee in his article An introduction to the theory of ideals, etc., these Transactions, vol.
31 (1929), p. 71. In case $\mathfrak{G}$ is a set of integral algebraic numbers, these definitions are equivalent to the
usual definitions.

§ Journal für Mathematik, vol. 29 (1845), p. 49.


(1) $\eta_i' = \mu_i\, \eta_1^{a_{i1}} \eta_2^{a_{i2}} \cdots \eta_n^{a_{in}}$ $(i = 1, 2, \ldots, n)$,
where $\mu_i$ is a root of unity and the a’s are rational integers. Let the nth order
matrix $A = (a_{ij})$ and let I be the identity matrix.

LEMMA 1. A is a root of
(2) $f(x) = x^{e-1} + x^{e-2} + \cdots + x + 1$
and it is not a root of an equation of lower degree with rational coefficients.

By (1), if 


(k) (k) (k) ‘ 


where $\mu_i^{(k)}$ is a root of unity and the matrix $(a_{ij}^{(k)}) = A^k$. Since $\eta_i \eta_i' \cdots \eta_i^{(E-1)} = \pm 1$,
it follows that A is a root of
$f_1(x) = x^{E-1} + x^{E-2} + \cdots + x + 1.$
If F is real, it follows that A is a root of (2). Suppose F is imaginary. Then
$f_1(A) = f(A)(A^e + I) = 0.$
To prove that A is a root of (2), it suffices to show that $A^e + I$ is non-singular.
Let A*+J=(6;;). We have 

nn? = nf (i = 1,2,---, n), 
where 2; is a root of unity. Suppose (@;;) is singular. Then the system of equa- 
tions 


=0 


j=1 


has a solution in rational integers, not all zero, and 


= 


and every ¢ is a root of unity. Let lg @ be the real logarithm of |@|. Then 
lg 0 =1g @ and, since =1, 


=0 (¢ = 0,1,2,---,#— 1). 


j=l 


From this it follows that the regulator, $R = \pm\,|\lg \eta_i \;\; \lg \eta_i' \;\cdots\; \lg \eta_i^{(n-1)}|$
$(i = 1, 2, \ldots, n)$, of F is zero.* But this is known to be false. Hence $A^e + I$
is non-singular and A is a root of (2).


* We take the same definition of R as that used by Gut, loc. cit., p. 200. 




It may be shown by the same method employed by Pollaczek on a similar
problem,* that A is not a root of an equation of degree < n with rational coefficients.
The lemma follows.

Let $x_1, x_2, \ldots, x_n$ be independent variables and let


(k) (k) (k) 
x 


For a fixed k, the matrix of the forms $x_i^{(k)}$ is the transpose of $A^k$. By Lemma 1,
$A^e = I$. Thus we have a cyclic group of linear homogeneous substitutions
$S, S^2, \ldots, S^e = 1$, on the x’s. For every pair of integers i, k,


it being understood that if 7=7; (mod e), OSji<e, then x69 =x, x69 =x, 
If @ is a unit of F, by (1) and (3) 
(5) 


(n) (n) (n) 


where the $\mu$’s are roots of unity and the x’s are rational integers. It will be
observed that if we apply a substitution $s^i$ to $\theta$, the resulting unit is the
same, except perhaps for a factor which is a root of unity, as that obtained by
applying the substitution $S^i$ to the x’s when $\theta$ is written as in the first equation
above.

If 0<t<e and if i+k=# (mod e), by (4) and (5), 


where 4 is a root of unity and z;=x; (j=1, 2, - - - , m). Let the determinant 
of the x’s in the first m equations of (5) be V(x, x2, - - - , tn) and let 

lg lg eee lg 

lg 0’ ---lga™ 
(6) 60) =| 


lg Ig lg 


* Mathematische Zeitschrift, vol. 21 (1924), pp. 8, 9; Bulletin of the National Research Council, 
No. 62, Algebraic Numbers, II, pp. 94-96. 




Employing (6) and the same rule for the multiplication of determinants as
for matrices, we find $\Psi(x_1, x_2, \ldots, x_n) \cdot R = \pm\delta(\theta)$. Hence*


(7) $\dfrac{\delta(\theta)}{R} = \pm\Psi(x_1, x_2, \ldots, x_n).$


3. The set $\mathfrak{G}$. The algebraic roots of (2) are distinct and hence the same
is true of the factors of f(x) which are irreducible in the rational field.

Let C be any matrix such that (2) is the equation of minimum degree
with rational coefficients, the leading coefficient being unity, which has C as
a root. Let $\mathfrak{G}$ be the set of all polynomials in C with rational integral coefficients.
It has been shown that there is a one-to-one correspondence between
the classes of ideals in $\mathfrak{G}$ and certain classes of matrices.† Since (2) is the
minimum equation of A, by the proof of this result, there is a non-singular
ideal $\mathfrak{K}$, in $\mathfrak{G}$, with a basis $\omega_1, \omega_2, \ldots, \omega_n$ such that

$C\omega_i = a_{i1}\omega_1 + a_{i2}\omega_2 + \cdots + a_{in}\omega_n$ $(i = 1, 2, \ldots, n).$

Let $\tau$ be an element in $\mathfrak{K}$. By the last equations and (3),

$C^{i-1}\tau = x_1^{(i-1)}\omega_1 + x_2^{(i-1)}\omega_2 + \cdots + x_n^{(i-1)}\omega_n$ $(i = 1, 2, \ldots, n),$

where the x’s are rational integers. The determinant of the coefficients of the
$\omega$’s is $\Psi(x_1, x_2, \ldots, x_n)$.
$I, C, C^2, \ldots, C^{n-1}$ form a basis of $\mathfrak{G}$. Hence

$\omega_i = g_{i1}I + g_{i2}C + \cdots + g_{in}C^{n-1}$ $(i = 1, 2, \ldots, n),$

where the g’s are rational integers such that the absolute value of the determinant
$|g_{ij}|$ is the norm of $\mathfrak{K}$. If we employ the last equations to eliminate
the $\omega$’s in the above expressions for $C^{i-1}\tau$ $(i = 1, 2, \ldots, n)$, in the resulting
equations the determinant of the coefficients of the powers of C is
$\Psi(x_1, x_2, \ldots, x_n) \cdot N(\mathfrak{K})$. But the $C^{i-1}\tau$ form a basis of the principal ideal $\{\tau\}$.
Hence $\Psi(x_1, x_2, \ldots, x_n) \cdot N(\mathfrak{K}) = \pm N(\tau)$. Since $\mathfrak{K}$ is non-singular, $N(\mathfrak{K}) \ne 0$.
Therefore by (7) we have


* By employing Lemma 1, it may be shown that $\Psi$ is an invariant of the above-mentioned substitution
group. See Fricke, Lehrbuch der Algebra, vol. 2, p. 14.

† Latimer and MacDuffee, A correspondence between classes of ideals and classes of matrices,
Annals of Mathematics, vol. 34 (1933).


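LEMMA 2. If $\theta = \mu\, \eta_1^{x_1} \eta_2^{x_2} \cdots \eta_n^{x_n}$ is a unit of F, where $\mu$ is a root of unity,
then

$\dfrac{\delta(\theta)}{R} = \pm\Psi(x_1, x_2, \ldots, x_n) = \pm\dfrac{N(\tau)}{N(\mathfrak{K})},$

where $\mathfrak{K}$ is a non-singular ideal in $\mathfrak{G}$, with a basis $\omega_1, \omega_2, \ldots, \omega_n$ such that
$C\omega_i = a_{i1}\omega_1 + \cdots + a_{in}\omega_n$ $(i = 1, 2, \ldots, n)$
and $N(\tau)$ is the norm of the principal ideal $\{\tau\}$, $\tau = x_1\omega_1 + \cdots + x_n\omega_n$.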


4. Proof of principal theorem. Let 8 be the group which has as its ele- 
ments the ¢(m) integers in a reduced set of residues, modulo m. The numbers 
of F are those numbers of 2 which are unaltered under every substitution 
(p, p*), where p is a primitive mth root of unity and a is an integer in a sub- 
group Ul of ®. Let the co-sets (Nebengruppen) of ® with respect to Ul be 
Uo=U, Ui, Us, - -- , Then where the y; are properly chosen 
integers. The factor group #/U is simply isomorphic with %,* which by 
hypothesis is cyclic. Hence we may assume that s=(p, p’), where y is an 
integer such that y”¥ =a (mod m), a an element in Ul. If m is odd, we may 
assume that ¥ is odd, while if m is even, y is necessarily odd since the same is 
true of a. 

If F is real, by Gut’s results, $h = \Delta/R$ where†


m/2 


(8) A= [Td — x(&) log 


xX kel 


In the product, x ranges over all the elements, except the identity element, 
of a group of characters which is simply isomorphic with %. Since Y is cyclic, 
we have 


n m/2 


(9) A= > — x*(&) log 
t=1 k=l m 

where $\chi$ is a fixed character. It may be shown that if a and b are prime to
m, $\chi(a) = \chi(b)$ if and only if a and b are congruent, modulo m, to elements in
the same co-set $\mathfrak{U}_i$. After proper choice of notation, we may assume that if a
belongs to $\mathfrak{U}_i$, $\chi(a) = \zeta^i$, where $\zeta$ is a primitive eth root of unity. Employing
$\chi(m-k) = \chi(k)$, $\chi(k) = 0$ if (m, k) > 1, and $\sum_k \chi^t(k) = 0$ $(0 < t < e)$, it may be
shown that


m/2 


ark m—1 n 
(10) 2 Dox'(k) log sin— = Dox'(k) lg (1 — p*) = rz, 
m kel 


k=1 i=0 


* Weber, Lehrbuch der Algebra, 2d edition, vol. 2, p. 75. 
† Gut, loc. cit., pp. 200, 223.


where \)=II(1—p*), @ ranging over all the elements of U, and \;=d? 
(i=1, 2, - - - , m). Employing a well known property of cyclic determinants, 
it may be shown from (9) and (10) that 


A = + 800), 


t=1 


where $\theta = (\lambda_1/\lambda_0)^{1/2}$.* Hence $h = \pm\delta(\theta)/R$. We shall show that $\theta$ is a unit of
F. Since F is real, $\mathfrak{U}$ contains −1. Therefore $\theta$ is a product of units in the

form 
(1 — p*)(1 — 1 — p* 


Since $\gamma$ is odd, the unit on the left belongs to Q, and hence the same is true of
$\theta$. Since $\theta$ is unaltered under every substitution $(\rho, \rho^a)$, a in $\mathfrak{U}$, it belongs to
F.


If F is imaginary, Gut’s expression for h may be written $h = h_1 \cdot h_2$, where
$h_1$ is a closed expression and $h_2 = \Delta/R$, where $\Delta$ is exactly the same as the
right side of (8), except that in this case $\chi$ ranges over those characters,
except the principal character, such that $\chi(-1) = 1$.† The whole group of
characters is simply isomorphic with the Galois group and hence every character is a power
of one of them. For a generating character $\chi$, we have $\chi(-1) = -1$. Hence
where 


n m/2 


A= J] > — x*(&) log a. 


tel kewl 


Since $s^e(\theta)$ is the conjugate imaginary of $\theta$, the co-set $\mathfrak{U}_e$ contains −1
and we may take as the elements of $\mathfrak{U}_{i+e}$ the negatives of the elements in the
corresponding $\mathfrak{U}_i$. If a and b are prime to m and a is in $\mathfrak{U}_i$, then $\chi^2(a) = \chi^2(b)$
if and only if b is congruent to an element in $\mathfrak{U}_i$ or in $\mathfrak{U}_{i+e}$. The notation for
the co-sets may be so chosen that if a belongs to $\mathfrak{U}_i$ then $\chi^2(a) = \zeta^i$, where $\zeta$
is a primitive eth root of unity. If we define the $\lambda_i$ as before, let $\theta = (\lambda_1 \cdot \lambda_{e+1} / \lambda_0 \cdot \lambda_e)$
and employ the fact that $\lambda_{i+e}$ is the conjugate imaginary of $\lambda_i$,
we find as before that $\Delta = \pm\delta(\theta)$, $h_2 = \pm\delta(\theta)/R$ and $\theta$ is a real unit of F. By
Lemma 2, we have then the following, except the last sentence.


THEOREM 1. Let F be a field, of degree E, which is cyclic with respect to the 
rational field. Let e= E or e=E/2 according as F is real or imaginary, and let 


* For a special case of this, see Fueter, Die Klassenzahl zyklischer Körper, etc., Journal für Mathematik,
vol. 147 (1917), p. 183.
† Gut, loc. cit., pp. 201, 223.


n = e − 1. Let $\mathfrak{G}$ be the set of all polynomials with rational integral coefficients in
the nth order matrix $A = (a_{ij})$, where the a’s are given in (1). If F is real let H be
the class number of F, and if F is imaginary let H be the absolute value of Gut’s
second factor of the class number. Then


$H = N(\tau)/N(\mathfrak{K}),$


where $N(\mathfrak{K})$ is the norm of a non-singular ideal $\mathfrak{K}$ in $\mathfrak{G}$ and $N(\tau)$ is the norm of
a principal ideal $\{\tau\}$ in $\mathfrak{G}$, $\tau$ being an element in $\mathfrak{K}$. If F is the field defined by a
primitive mth root of unity, m an odd prime, $\pm H$ is Kummer’s second factor of
the class number.


To prove the last sentence of the theorem, it suffices to note that our $\theta$,
$\delta(\theta)$, R, when properly specialized, are identical, except perhaps for sign,
with Kummer’s $e(\alpha)$, D, $\Delta$ respectively.*

It will be observed that by the proof of the above theorem, $\pm H$ is represented
by the form $\Psi(x_1, x_2, \ldots, x_n)$, which, as previously noted, is an invariant
of the cyclic substitution group defined by the transpose of A.

5. A special case of Theorem 1. Suppose e of Theorem 1 is an odd prime,
F real or imaginary. Let $\zeta$ be a primitive eth root of unity. $\zeta$ is a root of (2),
and $1, \zeta, \ldots, \zeta^{e-2}$ form a basis of the integral numbers in the field K
defined by $\zeta$. Hence by Lemma 1, $\mathfrak{G}$ is equivalent to the set of all integral
algebraic numbers in K. Then, by well known theorems in algebraic numbers,
there is an ideal $\mathfrak{L}$ such that $\{\tau\} = \mathfrak{K}\mathfrak{L}$ and $N(\tau) = N(\mathfrak{K}) \cdot N(\mathfrak{L})$. We have
then


THEOREM 2. If e in Theorem 1 is an odd prime,
$H = N(\mathfrak{L}),$
where $\mathfrak{L}$ is an ideal in the field defined by a primitive eth root of unity.


If F is the field defined by a primitive mth root of unity, m an odd prime, 
and if e=(m—1)/2 is also an odd prime, it may be shown that Kummer’s 
first factor of the class number is the norm of a principal ideal in K, K as 
above. Hence the class number of F is the norm of an ideal in K. 
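For e = 3 the field K is generated by a primitive cube root of unity $\zeta$. Assuming the familiar fact that the integers $\mathbb{Z}[\zeta]$ of this field form a principal ideal domain, every ideal norm is the norm of some number $\mu + \nu\zeta$. The sketch below, whose function names are introduced here for illustration only, computes that norm from the multiplication matrix on the basis 1, $\zeta$ and lists the small values it takes. These are the values of the form $\mu^2 - \mu\nu + \nu^2$ quoted from Eisenstein in the introduction.

```python
def norm_eisenstein(mu, nu):
    """Norm of mu + nu*zeta, where zeta^2 + zeta + 1 = 0.

    Multiplication by mu + nu*zeta on the basis (1, zeta) sends
        1    -> mu + nu*zeta
        zeta -> -nu + (mu - nu)*zeta      (using zeta^2 = -1 - zeta)
    and the norm is the determinant of that 2 x 2 matrix.
    """
    return mu * (mu - nu) - nu * (-nu)


def represented(limit):
    """Sorted positive integers <= limit of the form mu^2 - mu*nu + nu^2."""
    # The form equals (mu - nu/2)^2 + 3*nu^2/4, so |mu|, |nu| <= sqrt(4*limit/3).
    bound = int((4 * limit / 3) ** 0.5) + 1
    values = {
        norm_eisenstein(mu, nu)
        for mu in range(-bound, bound + 1)
        for nu in range(-bound, bound + 1)
    }
    return sorted(v for v in values if 0 < v <= limit)
```

Under these assumptions `represented(30)` enumerates the only values below 30 that H could take in Theorem 2 when e = 3.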


* Journal für Mathematik, vol. 40 (1850), pp. 110, 99; Bulletin of the National Research Council,
loc. cit., p. 34.


UNIVERSITY OF KENTUCKY,
LEXINGTON, KY.


THE BOUNDARY VALUES OF ANALYTIC FUNCTIONS. II* 


BY 
JOSEPH L. DOOB 
Let


(0.1) $f_1(z), f_2(z), \cdots$


be a uniformly bounded sequence of functions analytic for |z| < 1. By a
theorem of Fatou,† $\lim_{r \to 1} f_n(re^{it})$ exists almost everywhere on the interval
$0 < t < 2\pi$, defining a boundary function $F_n(e^{it}) = \lim_{r \to 1} f_n(re^{it})$ almost everywhere
on |z| = 1, $z = e^{it}$. A new sequence


(0.2) $F_1(z), F_2(z), \cdots$


is thus determined. What are the relations between these two sequences?
More generally, let the sequence (0.1) consist of functions meromorphic for
|z| < 1. In §2 below, a boundary function $\mathcal{F}_n(z)$ will be defined at every point
of |z| = 1 for any function $f_n(z)$ meromorphic for |z| < 1. A new sequence


(0.3) $\mathcal{F}_1(z), \mathcal{F}_2(z), \cdots$


is thus determined. What are the relations between the sequences (0.1) and
(0.3)?

The following questions are closely related to these two. Let f(z) be a
bounded function, analytic for |z| < 1, with Fatou boundary function F(z), as
defined above. Let P be a point on |z| = 1. Then what are the relations between
f(z) and F(z) in a neighborhood of P? More generally let f(z) be meromorphic
for |z| < 1. In §2 below, a boundary function $\mathcal{F}(z)$ of f(z) will be
defined at every point of |z| = 1. What are the relations between f(z) and
$\mathcal{F}(z)$ in a neighborhood of P?

The purpose of this paper is to treat these four questions. Before treating 
them, however, a number of definitions, some new and some old, will be made 
in the following two sections. 


1. METRIC DENSITY AND APPROXIMATE CONTINUITY 


In a previous paper‡ applications of the concepts of mean metric density
and approximate continuity to complex function theory were made by the
author. The following lemma will be used in discussing further applications.

* Presented to the Society, March 25 and March 26, 1932; received by the editors, February 6,
1932, and, in revised form, September 30, 1932.

† P. Fatou, Acta Mathematica, vol. 30 (1906), pp. 366-367.
‡ These Transactions, vol. 34 (1932), pp. 153-170.


LEMMA 1.1. Let E be a point set on the interval −1 < x < 1 having lower and
upper mean metric density $\delta_l$, $\delta_u$ respectively at x = 0. Let E become E' under the
transformation $x' = \psi(x)$ and let E' have lower and upper mean metric density
$\delta_l'$, $\delta_u'$ respectively at $\psi(0) = 0$.

(a) If $\psi'(x) = d\psi(x)/dx$ is continuous for −1 < x < 1, $\psi'(0) > 0$, then


(1.11) $\delta_l' = \delta_l$, $\delta_u' = \delta_u$.


(b) If $\psi(x) = x^{\nu}$ for $x \ge 0$, $\nu \ge 1$, and $\psi(x) = -|x|^{\nu}$ for x < 0, then $\delta_u = 1$
implies that $\delta_u' = 1$.
The simple proof of this lemma will be omitted here.*


COROLLARY. If E has lower and upper metric densities on the right $\delta_{lr}$, $\delta_{ur}$
respectively at x = 0 and if E' has lower and upper metric densities on the right
$\delta_{lr}'$, $\delta_{ur}'$ respectively at x = 0,


(1.12) $\delta_{lr}' = \delta_{lr}$, $\delta_{ur}' = \delta_{ur}$,
in Case (a) and $\delta_{ur} = 1$ implies $\delta_{ur}' = 1$ in Case (b).


We can suppose that E has no points to the left of the origin, when the
corollary follows immediately from the lemma.

Let F(z) be a measurable function defined almost everywhere on |z| = 1.
The idea of approximate continuity will be slightly extended as follows. If the
set of those points at which $|F(z) - \alpha| < \epsilon$ has upper mean metric density 1 at
$z_0$ for some complex number $\alpha$ and for all positive numbers $\epsilon$, F(z) will be
said to be quasi-approximately continuous at $z_0$ with limit value $\alpha$ there.
F(z) may be quasi-approximately continuous at a point with several limit
values there.

Let $F_1(z), F_2(z), \cdots$ be a sequence of measurable functions defined on a
set of positive measure E on |z| = 1. The sequence is said to converge in measure
to a (measurable) function F(z) when the measure of the set of those
points for which $|F(z) - F_n(z)| \ge \epsilon$ approaches 0 with 1/n for every positive
number $\epsilon$. If the sequence is uniformly bounded, one necessary and sufficient
condition for this is that F(z) be bounded and measurable on E and that


$\lim_{n \to \infty} \int_E |F(z) - F_n(z)|\, ds = 0,$


and another that every subsequence of the sequence $\{F_n(z)\}$ contain a further
subsequence converging almost everywhere on E to F(z).†
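The distinction between convergence in measure and convergence almost everywhere can be seen in the standard "typewriter" sequence, sketched below on the parameter interval [0, 1) in place of the circle. The example and its function names are illustrative assumptions and do not come from the paper. The nth function is the indicator of a dyadic interval of length $2^{-k}$, where $n = 2^k + j$; the sequence is uniformly bounded and converges in measure to 0, yet it converges at no point.

```python
from fractions import Fraction


def interval(n):
    """Support [lo, hi) of the n-th typewriter function, n >= 1."""
    k = n.bit_length() - 1          # n = 2**k + j with 0 <= j < 2**k
    j = n - (1 << k)
    return Fraction(j, 1 << k), Fraction(j + 1, 1 << k)


def F_n(n, t):
    """Value at t of the indicator function of interval(n)."""
    lo, hi = interval(n)
    return 1 if lo <= t < hi else 0


def bad_set_measure(n, eps):
    """Measure of the set of t in [0, 1) where |F_n(t) - 0| >= eps, for eps > 0."""
    lo, hi = interval(n)
    return hi - lo if eps <= 1 else Fraction(0)


def hits(t, k):
    """How many n with 2**k <= n < 2**(k+1) satisfy F_n(t) = 1."""
    return sum(F_n(n, t) for n in range(1 << k, 1 << (k + 1)))
```

Here `bad_set_measure(n, eps)` tends to 0 as n grows, which is convergence in measure, while `hits(t, k)` equals 1 for every k, so that $F_n(t) = 1$ for infinitely many n at each t. The subsequence $n = 2^k$ does converge to 0 at every t > 0, in agreement with the subsequence criterion stated above.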


* Cf. the proof of Lemma 2.1 in the previous paper.
† F. Riesz, Paris Comptes Rendus, vol. 148 (1909), pp. 1303-1305.


LEMMA 1.2. Let $\{F_n(z)\}$ be a sequence of measurable functions defined on a
measurable set E on |z| = 1, mE > 0. A necessary and sufficient condition that
the sequence converge in measure on E to the measurable function F(z) is that


$\lim_{n \to \infty} B\{|F(z) - F_n(z)|, E_n\} = 0$*
for every sequence $\{E_n\}$ of measurable point sets on |z| = 1 such that $E_n \subset E$,
n = 1, 2, ..., and such that


$\liminf_{n \to \infty} mE_n > 0.$
This result is an immediate consequence of the definition of convergence
in measure.
It will be seen in §5 that the concepts of approximate continuity and con- 
vergence in measure are related to each other. 


2. CLUSTER VALUES OF FUNCTIONS AND OF SEQUENCES 


In the following, points of the extended plane, or of the sphere correspond- 
ing to it by stereographic projection, will be considered. “Closed,” “open,” 
etc., used of point sets of the plane, will refer to the corresponding point sets 
on the sphere. The point $\infty$ is then in no way exceptional, and is allowable as
a value assumed by a function.

Let f(z) be a single-valued function defined in a domain† $\gamma$ bounded by a
simple closed Jordan curve $\Gamma$ (i.e., a one-to-one and continuous image of the
perimeter of a circle). Let P be a point on $\Gamma$. Then if there is a complex number
$\alpha$ and a sequence of points $\{z_n\}$, in $\gamma$, such that
(2.01) $\lim_{n \to \infty} z_n = P$, $\lim_{n \to \infty} f(z_n) = \alpha$,

$\alpha$ is called a cluster value of f(z) in $\gamma$ at P. The set of all cluster values of f(z)
in $\gamma$ at P is called the cluster set of f(z) in $\gamma$ at P. This set is closed and connected
if f(z) is continuous in $\gamma$. The function $\mathcal{F}(z)$, defined for every point P
on $\Gamma$, as the cluster set of f(z) in $\gamma$ at P will be called the cluster boundary
function of f(z). It is evidently multiple-valued, in general. The function
f(z) is said to have the cluster value $\alpha$ on a given path to P if there exists a
sequence of points $\{z_n\}$ on that path, so that (2.01) is satisfied. If $\gamma$ is the
interior of the unit circle, |z| < 1, the path will be called non-tangential if it
is contained in some angle with vertex at P whose sides are chords of |z| = 1.

* Throughout this paper if F(z) is a function defined on a set E, $B\{|F(z)|, E\}$ will denote the
greatest lower bound of |F(z)| on E, and $O\{F(z), E\}$ will denote the oscillation of F(z) on E, i.e.
the least upper bound of |F(P) − F(Q)| for P, Q any two points of E.
† In this paper, any open connected point set will be called a domain.


If the path is a continuous curve C* and if there is only a single cluster value
of f(z) on C:
(2.02) $\lim_{z \to P} f(z) = \alpha$

when z approaches P on C, f(z) is said to have the convergence value $\alpha$ at P.

If there is a complex number $\alpha$, a sequence of points $\{z_n\}$ on $\Gamma$ in a neighborhood
of P, all on one side of P and different from P, such that


(2.03) $\lim_{n \to \infty} z_n = P$, $\lim_{n \to \infty} \mathcal{F}(z_n) = \alpha$


(choosing one definite value of $\mathcal{F}(z_n)$ for each value of n), $\alpha$ is called a cluster
value of f(z) on $\Gamma$ at P on the side in question. The cluster sets of f(z) on $\Gamma$ at
P on each side are then defined as the set of all the cluster values of f(z) on
that side, and the cluster set of f(z) on $\Gamma$ at P is the sum of these two sets. If
f(z) is continuous in $\gamma$, the cluster sets of f(z) on $\Gamma$ at P on each side are closed
and connected. If E is a point set on $\Gamma$ which has P as a limit point, and if in
(2.03) the points $\{z_n\}$ all belong to E, the set of all values $\alpha$ thus determined
will be called the cluster set of f(z) on $\Gamma$ on E at P. These ideas were introduced
by Painlevé.†

It does not seem to have been realized that the above definitions are
analogues of certain definitions for sequences of functions, defined in the
interior of the unit circle. Let


(2.04) $f_1(z), f_2(z), \cdots$


be a sequence of single-valued functions defined for |z| < 1, with cluster
boundary functions $\mathcal{F}_1(z), \mathcal{F}_2(z), \cdots$ respectively on |z| = 1. Then if there
is a complex number $\alpha$, a subsequence $\{f_{a_n}(z)\}$, and a sequence of points
$\{z_{a_n}\}$ in |z| < 1, such that
(2.05) $\lim_{n \to \infty} f_{a_n}(z_{a_n}) = \alpha$,
$\alpha$ will be called a cluster value of the sequence (2.04) in |z| < 1. If $g_n(z)$ is
defined by
— t 
(2.06) gn(z) = 


1 
and if 


* That is, C is determined by $z = \psi(t)$ where $\psi(t)$ is continuous for $0 \le t \le 1$, $z = \psi(t)$ is in $\gamma$ for
0 < t < 1, $\psi(1) = P$.

† P. Painlevé, Paris Comptes Rendus, vol. 131 (1900), p. 489.

‡ The conjugate complex number of $\zeta$ is denoted by $\bar{\zeta}$.


(2.07) $\lim_{n \to \infty} g_n(z) = \alpha$
uniformly in every closed subregion of |z| < 1, $\alpha$ will be called a convergence
value of the sequence. The sets of all cluster and convergence values of the
sequence in |z| < 1 will be called the cluster and convergence sets in |z| < 1,
respectively. The former set is closed.

If there is a complex number $\alpha$, a subsequence $\{f_{a_n}(z)\}$, and a sequence of
points $\{z_{a_n}\}$ on |z| = 1, such that
(2.08) $\lim_{n \to \infty} \mathcal{F}_{a_n}(z_{a_n}) = \alpha$
(choosing one definite value for $\mathcal{F}_{a_n}(z_{a_n})$ for each value of n), $\alpha$ will be called
a cluster value of the sequence (2.04) on |z| = 1. The set of all cluster values
of the sequence on |z| = 1, which we designate as the cluster set of the
sequence on |z| = 1, is closed.

Let {A,} be a set of open arcs on |z| =1. Then if a is a cluster value of 
the sequence (2.04) in |z| <1 in accordance with the definition given above 
and if under the transformation 


(2.09) 


the arc A,, becomes the arc A,’ such that 


(2.10) lim inf mA.,’ > 0, 


a will be called a cluster value of the sequence in |z| <1 with respect to the 
arcs {A,}. If in particular 


(2.11) lim mA,,’ = 


2 


a will be called a strong cluster value of the sequence in |z| <1 with respect 
to the arcs {A,}. If 


lim sup | z.,| < 1, 


the conditions (2.10) and (2.11) are equivalent to 


lim inf mA,, > 0, lim mA,, = 2r, 


2 


respectively. If 
(2.12) A, = Ai," > 1, mA, < 27, 


then in the case (2.10), a convergent subsequence of the sequence {2,,} must 


[April 
3 =— Za, 
1 
| 


1933] BOUNDARY VALUES OF ANALYTIC FUNCTIONS 423 


approach (i) a point of |z| <1, or (ii) a point of Ai, or (iii) an end point of Aj, 
remaining in the angle between A; and some chord through that end point. 
In the case (2.11), still assuming (2.12), a convergent subsequence of the
sequence $\{z_{a_n}\}$ must approach (i) a point of $A_1$, or (ii) an end point of $A_1$,
approaching the end point tangentially, on the same side of the end point
as $A_1$. Conversely the conditions given are sufficient that $\alpha$ be a cluster value
or a strong cluster value of the sequence $\{f_n(z)\}$ with respect to the arcs
$\{A_n\}$ respectively. There is only slight modification of these criteria if
$\liminf_{n\to\infty} mA_n > 0$. The sets of all cluster values and strong cluster values
with respect to a set of arcs will be called the cluster set and the strong cluster
set of the sequence with respect to those arcs, respectively. It is not hard to
show that the latter is a closed subset of the former. If $\alpha$ is a cluster value
with respect to a set of arcs on $|z| = 1$, i.e. if (2.05) and (2.10) are satisfied,
and if (2.07) is also satisfied, $\alpha$ will be called a convergence value of the
sequence with respect to those arcs. The convergence set with respect to the
arcs will then be the set of all these convergence values. The convergence set
with respect to a set of arcs is a subset of the strong cluster set with respect
to the arcs. For if $\alpha$ is a convergence value with respect to the arcs $\{A_n\}$ we
can suppose that $\liminf_{n\to\infty} mA_{a_n} > 0$ (or we could use the sequence
determined by (2.06)). By (2.07) there is then a subsequence $\{f_{b_n}(z)\}$ of $\{f_{a_n}(z)\}$
and a sequence of points $\{\xi_{b_n}\}$ such that $\lim_{n\to\infty} |\xi_{b_n}| = 1$ and such that
$\lim_{n\to\infty} f_{b_n}(\xi_{b_n}) = \alpha$. We can suppose that $\xi_{b_n}$ is so chosen that the distance
from $\xi_{b_n}$ to the midpoint of the arc $A_{b_n}$ approaches 0 with $1/n$. Then $\alpha$ is a
strong cluster value of the sequence with respect to the arcs $\{A_n\}$, by the
criterion suggested above.
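
The role of conditions (2.10) and (2.11) can be checked numerically. The sketch below assumes that (2.09) is the usual self-map of the unit circle, z -> (z - a)/(conj(a) z - 1) with a = z_{a_n}; this form is inferred from the context and is not quoted from the text. On |z| = 1 such a map stretches arc length by the factor (1 - |a|^2)/|z - a|^2, so the length of the image arc is the integral of that factor over the original arc. The name `image_arc_length` and the sample arc are illustrative choices only.

```python
import cmath
import math

def image_arc_length(a, theta1, theta2, steps=20000):
    """Length of the image of the arc {e^{it}: theta1 <= t <= theta2} under
    T(z) = (z - a) / (conj(a) z - 1), |a| < 1 (assumed form of (2.09)).

    On |z| = 1 we have |T'(z)| = (1 - |a|^2) / |z - a|^2, so the image length
    is the integral of this factor over the arc (midpoint rule)."""
    h = (theta2 - theta1) / steps
    total = 0.0
    for k in range(steps):
        z = cmath.exp(1j * (theta1 + (k + 0.5) * h))
        total += (1 - abs(a) ** 2) / abs(z - a) ** 2
    return total * h

# Quarter arc 0 <= theta <= pi/2, with midpoint e^{i pi/4}.
QUARTER = (0.0, math.pi / 2)
centered = image_arc_length(0, *QUARTER)                                    # a = 0
near = image_arc_length(0.99 * cmath.exp(1j * math.pi / 4), *QUARTER)      # a near the arc
far = image_arc_length(0.99 * cmath.exp(5j * math.pi / 4), *QUARTER)       # a opposite the arc
```

With a = 0 the arc keeps its length. A point a close to the midpoint of the arc gives an image of length close to 2π, as required in (2.11), while a point close to the opposite side of the circle gives an image of very small length, so that (2.10) would fail along such a sequence.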

If $\alpha$ is a cluster value of the sequence (2.04) on $|z| = 1$ in accordance with
(2.08) and if the point $z_{a_n}$ lies on the arc $A_{a_n}$ for all values of $n$, $\alpha$ will be called
a cluster value of the sequence on $|z| = 1$ on the arcs $\{A_n\}$. The set of all
these cluster values will be called the cluster set of the sequence on the arcs
considered. This set is closed.

A point $\alpha$ will be said to be assumed by the sequence (2.04) if every function
of the sequence except for at most a finite number assumes the value $\alpha$.
A point $\alpha$ will be said to be exceptional to or omitted by the sequence if at
most a finite number of the functions assume the value $\alpha$.


3. THE PROPERTIES OF THE BOUNDARY FUNCTIONS OF A UNIFORMLY BOUNDED 
CONVERGENT SEQUENCE OF ANALYTIC FUNCTIONS 


Let

(3.01)    $f_1(z), f_2(z), \ldots$

be a uniformly bounded sequence of functions analytic for $|z| < 1$, with Fatou
boundary functions

(3.02)    $F_1(z), F_2(z), \ldots$


respectively. What are the relations between these two sequences? The 
sequence (3.01) forms a normal family.* If it is uniformly convergent in every 
closed subregion of |z| <1 to the limit function f(z), we can reduce the prob- 
lem to that in which the limit function vanishes identically by substituting 
the sequence {f,(z)—f(z)} for {f,(z)}. This will be convenient in much of 
what follows. The problem solved in this section is the following. Necessary 
and sufficient conditions are found on the sequence (3.02) that the sequence 
(3.01) converge uniformly in every closed subregion of |z| <1 to a function 
f(z), where, if convenient, the limit function f(z) is supposed to vanish 
identically. It will be seen that any domain bounded by a simple closed 
rectifiable Jordan curve could be used instead of |z| <1 as the domain of 
definition of the functions of the sequence (3.01). 

Lemma 3.1. Let $f_1(z), f_2(z), \ldots$ be a uniformly bounded sequence of functions
analytic for $|z| < 1$, with Fatou boundary functions $F_1(z), F_2(z), \ldots$ respectively,
$|F_n(z)| \le 1$, $n = 1, 2, \ldots$. Then if there is a sequence of points $\{z_n\}$ such
that

(3.101)    $\limsup_{n\to\infty} |z_n| < 1, \qquad \lim_{n\to\infty} |f_n(z_n)| = 1,$

it follows that

(3.102)    $\lim_{n\to\infty} |f_n(z)| = 1$

uniformly in every closed subregion of $|z| < 1$ and that the sequence $\{|F_n(z)|\}$
converges in measure to 1 on $|z| = 1$.

The sequence $\{f_n(z)\}$ forms a normal family, as remarked above, and
any limit function $f(z)$ must satisfy the two inequalities

(3.103)    $|f(z)| \le 1, \qquad |f(z_0)| = 1, \qquad |z_0| < 1,$

where $z_0$ is a limit point of the sequence $\{z_n\}$. It follows from the maximum
principle that $|f(z)| \equiv 1$. Then every limit function of the sequence is a
constant of modulus 1. If the sequence $\{|f_n(z)|\}$ did not converge uniformly to
1 in every closed subregion of $|z| < 1$, there would be a closed subregion $R$,
and a positive number $\rho < 1$, such that

(3.104)    $|f_{a_n}(\xi_{a_n})| \le \rho$


* P. Montel, Lecons sur les Familles Normales, Paris, 1927, p. 21. 


($n = 1, 2, \ldots$),

for some subsequence $\{f_{a_n}(z)\}$ and a sequence $\{\xi_{a_n}\}$ of points in $R$. From
$\{f_{a_n}(z)\}$ can be extracted a further subsequence $\{f_{b_n}(z)\}$ such that
$\lim_{n\to\infty} |f_{b_n}(z)| = 1$ uniformly in $R$, contradicting (3.104). The sequence $\{|f_n(z)|\}$
therefore converges uniformly to 1 in every closed subregion of $|z| < 1$. In
particular $\lim_{n\to\infty} |f_n(0)| = 1$. Now by the Cauchy integral formula,

(3.105)    $|f_n(0)| \le \dfrac{1}{2\pi} \int_{|z|=1} |F_n(z)|\,|dz| \le 1,$

so that

(3.106)    $0 = \lim_{n\to\infty} [1 - |f_n(0)|] \ge \lim_{n\to\infty} \dfrac{1}{2\pi} \int_{|z|=1} [1 - |F_n(z)|]\,|dz| \ge 0,$

which proves that the sequence $\{|F_n(z)|\}$ converges in measure to 1 on
$|z| = 1$.*
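
A simple family illustrating Lemma 3.1, chosen here for illustration and not taken from the text, is f_n(z) = (z + a_n)/(1 + a_n z) with 0 < a_n < 1 and a_n -> 1. Here |F_n(z)| = 1 on |z| = 1, and the choice z_n = 0 gives |f_n(0)| = a_n -> 1, so that (3.101) holds. The following sketch estimates the minimum of |f_n| on the closed subregion |z| <= 1/2 on a polar grid; the grid sizes are arbitrary.

```python
import cmath
import math

def f(a, z):
    """Illustrative family: f(z) = (z + a) / (1 + a z), 0 < a < 1."""
    return (z + a) / (1 + a * z)

def min_modulus_on_disk(a, r=0.5, rings=20, angles=180):
    """Smallest value of |f| found on a polar grid covering |z| <= r."""
    smallest = float("inf")
    for i in range(rings + 1):
        rho = r * i / rings
        for k in range(angles):
            z = cmath.rect(rho, 2 * math.pi * k / angles)
            smallest = min(smallest, abs(f(a, z)))
    return smallest

# The minimum over |z| <= 1/2 climbs toward 1 as a -> 1.
for a in (0.9, 0.99, 0.999):
    print(a, min_modulus_on_disk(a))
```

For this family the minimum on |z| <= r is attained at z = -r and equals (a - r)/(1 - a r) when a > r, which tends to 1 as a -> 1, in agreement with (3.102).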

THEOREM 3.1. Let $f_1(z), f_2(z), \ldots$ be a uniformly bounded sequence of functions
analytic for $|z| < 1$, with Fatou boundary functions $F_1(z), F_2(z), \ldots$
respectively, $|F_n(z)| \le 1$, $n = 1, 2, \ldots$.

(a) If there is a sequence of points $\{z_n\}$ such that

(3.111)    $\limsup_{n\to\infty} |z_n| < 1, \qquad \lim_{n\to\infty} f_n(z_n) = \alpha, \qquad |\alpha| = 1,$

it follows that

(3.112)    $\lim_{n\to\infty} f_n(z) = \alpha$

uniformly in every closed subregion of $|z| < 1$ and that the sequence $\{F_n(z)\}$
converges in measure to $\alpha$ on $|z| = 1$.†

(b) If the sequence $\{F_n(z)\}$ converges in measure to $\alpha$, $|\alpha| \le 1$, on a measurable
set $E$ on $|z| = 1$, $mE > 0$, the conclusions of (a) hold.

If we take $|\alpha| = 1$ in (b), the condition of (b) is necessary and sufficient
that (3.112) hold, so Theorem 3.1 solves the given problem in a very particular
case.

(a) We can assume that $\alpha = 1$. The result (a) is then simply Lemma 3.1
applied to the sequence $\{\phi_n(z)\}$, where

(3.113)    $\phi_n(z) = e^{f_n(z) - 1}.$


* Cf. §1. 

† A similar theorem was proved by J. L. Walsh (who uses the term quasi-convergence instead of
convergence in measure), and applied in another connection, these Transactions, vol. 32 (1930),
pp. 378-379.




(b) To prove (b) it is only necessary to prove that the hypothesis of (b)
implies (3.112). By a theorem of Khintchine and Ostrowski* the actual convergence
of the sequence $\{F_n(z)\}$ almost everywhere on $E$ implies (3.112).
The proof as given in Bieberbach's book also proves the more general result
desired. A still more general result will be useful, however. Let $E_n(\epsilon)$ be the
set of those points on $|z| = 1$ for which $|F_n(z) - \alpha| < \epsilon$. Then it is sufficient in (b)
if


(3.114) lim lim sup | = 0. 

n— 2 
This follows from the Ostrowski-Nevanlinna inequality, or the proof referred
to above can be modified to prove this also (by choosing the constant $\lambda$ used
in it properly). It is sufficient for (3.114) that

$\liminf_{\epsilon\to 0} \left[ \liminf_{n\to\infty} mE_n(\epsilon) \right] > 0,$

and it is this special case which will be used most in the applications in this
paper.

COROLLARY. In the above theorem if $w_n = f_n(z_n)$, for large values of $n$:
$n \ge n(\rho)$, is outside every circle $C_\rho$ tangent to $|w| = 1$ at $w = \alpha$, of radius $\rho < 1$,
the same will be true of the values of $w = f_n(z)$ for $z$ in any fixed closed subregion
of $|z| < 1$ and the measure of the set of those points on $|z| = 1$ at which $w = F_n(z)$
is inside $C_\rho$ approaches 0 with $1/n$ for every value of $\rho < 1$.

We can suppose that $\alpha = 1$. The corollary is simply the theorem applied to
the sequence $\{\phi[f_n(z)]\}$ where $\phi(w)$ is defined by

$\phi(w) = e^{(w+1)/(w-1)}.$
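
The reason this choice of φ works is that |φ| is constant on each circle tangent to |w| = 1 at w = 1. A short numerical check of that fact is sketched below, assuming the formula φ(w) = exp((w+1)/(w-1)) and parametrizing the circle of radius ρ tangent at w = 1 as w = (1 - ρ) + ρ e^{iθ}; the helper names are illustrative.

```python
import cmath
import math

def phi(w):
    """phi(w) = exp((w + 1) / (w - 1)), defined for w != 1."""
    return cmath.exp((w + 1) / (w - 1))

def horocycle_point(rho, theta):
    """Point on the circle of radius rho tangent to |w| = 1 at w = 1."""
    return (1 - rho) + rho * cmath.exp(1j * theta)

def moduli_on_horocycle(rho, samples=12):
    """Values of |phi| at equally spaced points of the circle,
    skipping theta = 0 (the tangency point w = 1, where phi is undefined)."""
    return [abs(phi(horocycle_point(rho, 2 * math.pi * k / samples)))
            for k in range(1, samples)]
```

A direct computation gives Re((w+1)/(w-1)) = -(1 - ρ)/ρ on this circle, so the sampled moduli should all equal exp(-(1 - ρ)/ρ), which tends to 1 as ρ -> 1. Values of w outside every such circle are therefore exactly those on which |φ(w)| is close to 1.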


THEOREM 3.2. Let $f_1(z), f_2(z), \ldots$ be a uniformly bounded sequence of functions
analytic for $|z| < 1$, with Fatou boundary functions $F_1(z), F_2(z), \ldots$
respectively. Suppose that $f_n(z) \ne 0$, $n = 1, 2, \ldots$.

(a) If there is a sequence of points $\{z_n\}$ such that

(3.21)    $\limsup_{n\to\infty} |z_n| < 1, \qquad \lim_{n\to\infty} f_n(z_n) = 0,$

it follows that

(3.22)    $\lim_{n\to\infty} f_n(z) = 0$


* See, for example, L. Bieberbach, Lehrbuch der Funktionentheorie, 2d edition, vol. II, 1931, pp. 
157-158. 




uniformly in every closed subregion of $|z| < 1$ and that the sequence
$\{1/\log F_n(z)\}$* converges in measure to 0 on $|z| = 1$ whatever branch of $\log F_n(z)$ is
chosen.

(b) If the sequence $\{1/\log F_n(z)\}$ converges in measure to 0 on a measurable
set $E$ of positive measure on $|z| = 1$, it follows that

(3.23)    $\lim_{n\to\infty} 1/\log f_n(z) = 0$

uniformly in every closed subregion of $|z| < 1$ and that the sequence
$\{1/\log F_n(z)\}$ is convergent in measure to 0 on $|z| = 1$, where the branch of
$\log F_n(z)$ is determined by that of $\log f_n(z)$ in (3.23).

The uniformly bounded sequence $\{f_n(z)\}$ forms a normal family. Let
$f(z)$ be a limit function of the family: $f(z) = \lim_{n\to\infty} f_{a_n}(z)$. Then $f(z_0) = 0$ in
Case (a), where $z_0$ is a limit point of the sequence $\{z_n\}$. Then by a well known
theorem of Hurwitz, if $f(z) \not\equiv 0$, $f_{a_n}(z)$ must vanish in a neighborhood of $z_0$ for
all large values of $n$. Since this is not true, $f(z) \equiv 0$. Thus every limit function
of the family vanishes identically, and (3.22) is proved by an argument
similar to that used in proving (3.102) in the proof of Lemma 3.1.

We can suppose that $|f_n(z)| < 1$, $n = 1, 2, \ldots$. Define the function
$\phi_n(z)$, analytic for $|z| < 1$, by

(3.24)    $\phi_n(z) = \dfrac{\log f_n(z) + 1}{\log f_n(z) - 1}.$

Then $|\phi_n(z)| < 1$ and $\phi_n(z)$ has the boundary function $\Phi_n(z)$:

(3.25)    $\Phi_n(z) = \dfrac{\log F_n(z) + 1}{\log F_n(z) - 1}.$

Now $\lim_{n\to\infty} \phi_n(z) = 1$ and $\lim_{n\to\infty} 1/\log f_n(z) = 0$ are equivalent statements, so
the theorem is an immediate consequence of Theorem 3.1. We note a generalization
of (b) corresponding to one of Theorem 3.1 (b) which will be used in
proving the next theorem. Let $E_n(\epsilon)$ be the set of those points on $|z| = 1$ for
which $|\log F_n(z)| > 1/\epsilon$. It is sufficient in (b) if

$\liminf_{\epsilon\to 0} \left[ \liminf_{n\to\infty} mE_n(\epsilon) \right] > 0.$
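
The substitution (3.24) rests on two elementary facts: w -> (w+1)/(w-1) carries the half-plane Re w < 0 (where log f_n lies when |f_n| < 1) into the unit circle, and 1 - φ_n = -2/(log f_n - 1), so that φ_n -> 1 exactly when 1/log f_n -> 0. A minimal sketch checking both on sample values is given below; it uses the principal branch of the logarithm, which is an arbitrary choice made here, since any other branch differs by a multiple of 2πi and stays in the same half-plane.

```python
import cmath

def phi(f_value):
    """(3.24) evaluated at a single value f with 0 < |f| < 1,
    using the principal branch of log (an assumption of this sketch)."""
    w = cmath.log(f_value)
    return (w + 1) / (w - 1)

SAMPLES = [0.5, 0.3 + 0.4j, -0.9, 1e-3j, 1e-50]

# |phi| < 1 at every sample, and phi is close to 1 when f is close to 0.
moduli = [abs(phi(v)) for v in SAMPLES]
gap_at_tiny_f = abs(1 - phi(1e-50))
```

The quantity `gap_at_tiny_f` equals 2/|log f - 1| for f = 10^{-50}, about 0.017, which shows how slowly φ_n approaches 1 compared with the rate at which f_n approaches 0.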


* Choose a branch of $\log f_n(z)$ at some point of $|z| < 1$ and continue it analytically throughout
$|z| < 1$, determining a single-valued analytic function which has finite radial boundary values wherever
$F_n(z) \ne 0$. Since $f_n(z) \not\equiv 0$, $F_n(z) = 0$ at most on a set of measure 0 by a theorem of F. and M.
Riesz. Then this branch of $\log f_n(z)$ has a finite-valued and single-valued boundary function defined
almost everywhere on $|z| = 1$ which will be denoted by $\log F_n(z)$. The function $\log F_n(z)$ has infinitely
many branches differing by integral multiples of $2\pi i$.




The condition that the sequence $\{1/\log F_n(z)\}$ converge in measure to 0
on $|z| = 1$ is equivalent to

(3.26)    $\lim_{n\to\infty} B\{|1/\log F_n(z)|, E_n\} = 0,$

for every sequence $\{E_n\}$ of measurable sets on $|z| = 1$ such that

$\liminf_{n\to\infty} mE_n > 0,$

by Lemma 1.2. In this form the result can be more readily compared with
that in the next theorem.

An example of the theorem in which $|F_n(z)| = 1$ except at $z = 1$, $F_n(1) = 0$,
for all values of $n$ is given by


(3.27)    $f_n(z) =$
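
The right-hand side of (3.27) is not legible in this copy. One family with the stated properties, used here purely as an assumed stand-in and not as a quotation of the original formula, is f_n(z) = exp(n (z+1)/(z-1)): it never vanishes in |z| < 1, satisfies |f_n(z)| < 1 there, has boundary values of modulus 1 at every point of |z| = 1 other than z = 1, and has radial limit 0 at z = 1. The sketch below checks the behavior required by Theorem 3.2 (a) numerically.

```python
import cmath
import math

def f(n, z):
    """Assumed stand-in for (3.27): f_n(z) = exp(n (z + 1) / (z - 1))."""
    return cmath.exp(n * (z + 1) / (z - 1))

def sup_on_disk(n, r, angles=360):
    """Largest |f_n| on the circle |z| = r; by the maximum principle this is
    also the supremum over the closed subregion |z| <= r."""
    return max(abs(f(n, cmath.rect(r, 2 * math.pi * k / angles)))
               for k in range(angles))

def boundary_modulus(n, theta):
    """|F_n(e^{i theta})| for theta not a multiple of 2 pi."""
    return abs(f(n, cmath.exp(1j * theta)))
```

For this family the supremum over |z| <= r is attained at z = -r and equals exp(-n (1 - r)/(1 + r)), so f_n -> 0 uniformly in every closed subregion, while on the boundary 1/log F_n(e^{iθ}) = (i/n) tan(θ/2) tends to 0 for every θ other than π.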


Theorem 3.2 enables us to solve a particular case of the problem proposed 
at the beginning of this section which includes the particular case of Theorem 
3.1 obtained by setting |a| =1 in (b). It will be seen that this particular case 
has important applications. 


THEOREM 3.3. Let $f_1(z), f_2(z), \ldots$ be a uniformly bounded sequence of functions
analytic for $|z| < 1$ with Fatou boundary functions $F_1(z), F_2(z), \ldots$
respectively. Suppose that $f_n(z) \ne 0$, $n = 1, 2, \ldots$.

(a) If there is a sequence of points $\{z_n\}$ such that

(3.301)    $\limsup_{n\to\infty} |z_n| < 1, \qquad \lim_{n\to\infty} f_n(z_n) = 0,$

it follows that $\lim_{n\to\infty} f_n(z) = 0$ uniformly in every closed subregion of $|z| < 1$ and
that

(3.302)    $\lim_{n\to\infty} \dfrac{B\{|F_n(z)|, E_n\}}{1 + O\{\operatorname{arc} F_n(z), E_n\}} = 0$ *

for every sequence $\{E_n\}$ of measurable point sets on $|z| = 1$ satisfying

(3.303)    $\liminf_{n\to\infty} mE_n > 0.$

(b) If there is a sequence of measurable point sets $\{\mathfrak{E}_n\}$ on $|z| = 1$ satisfying
(3.303) and such that every sequence $\{E_n\}$ of measurable point sets on $|z| = 1$
such that $E_n \subset \mathfrak{E}_n$, $n = 1, 2, \ldots$, which satisfies (3.303) also satisfies (3.302),
then $\lim_{n\to\infty} f_n(z) = 0$ uniformly in every closed subregion of $|z| < 1$.


* Cf. the first note on p. 420. Arc $F_n(z)$ can be defined as the imaginary part of $\log F_n(z)$: $\Im \log F_n(z)$.
Its oscillation on $E_n$ is independent of the branch of $\log F_n(z)$ chosen.




The statement (b) is stronger than the converse of (a). This theorem
shows what happens if $\lim_{n\to\infty} f_n(z) = 0$ under the above circumstances and if
the sequence of boundary functions does not converge in measure to 0. The
condition (3.302) is only slightly stronger than (3.26) as is to be expected.

(a) Suppose that (3.301) is satisfied. By the previous theorem

$\lim_{n\to\infty} f_n(z) = 0$

uniformly in every closed subregion of $|z| < 1$. Unless (3.302) is true, there is
a subsequence $\{F_{a_n}(z)\}$, a positive number $\lambda$ and a sequence $\{E_{a_n}\}$ of
measurable point sets on $|z| = 1$ such that $\liminf_{n\to\infty} mE_{a_n} > 0$ and such
that

(3.304)    $\dfrac{B\{|F_{a_n}(z)|, E_{a_n}\}}{1 + O\{\operatorname{arc} F_{a_n}(z), E_{a_n}\}} \ge \lambda$    ($n = 1, 2, \ldots$).

Let $P_{a_n}$ be a point of $E_{a_n}$ and choose $\log F_{a_n}(z)$ so that

(3.305)    $|\Im \log F_{a_n}(P_{a_n})| \le$        ($n = 1, 2, \ldots$).

Then, since $O\{\operatorname{arc} F_{a_n}(z), E_{a_n}\} < M/\lambda$ by (3.304), where $|F_n(z)| < M$, $n = 1, 2, \ldots$,

(3.306)    $|\Im \log F_{a_n}(z)| <$

on $E_{a_n}$. Now by (3.304), $|F_{a_n}(z)| \ge \lambda$ on $E_{a_n}$. Then

(3.307)    $|\log F_{a_n}(z)| \le |\log \lambda| + |\log M|$

on $E_{a_n}$. But by Theorem 3.2 the sequence $\{1/\log F_{a_n}(z)\}$ converges in measure
to 0 on $|z| = 1$, so this inequality is impossible.

(b) Suppose that the hypotheses of (b) are satisfied. Determine $\log F_n(z)$
from a branch of $\log f_n(z)$ for which

(3.308)    $|\Im \log f_n(0)| \le$

Then it is sufficient to show that the measure of the subset of $\mathfrak{E}_n$ on which
$|\log F_n(z)| < K$ approaches 0 with $1/n$ for every value of $K$. For then, by
Theorem 3.2 (b) in its generalized form, $\lim_{n\to\infty} 1/\log f_n(0) = 0$, which implies,
by (3.308), that $\lim_{n\to\infty} f_n(0) = 0$. This is sufficient that $\lim_{n\to\infty} f_n(z) = 0$ uniformly
in every closed subregion of $|z| < 1$, by Theorem 3.2 (a). Thus if (b)
were not true there would be a number $K$, and a subsequence $\{F_{a_n}(z)\}$, such
that

(3.309)    $|\log F_{a_n}(z)| < K$

on a subset $E_{a_n}$ of $\mathfrak{E}_{a_n}$, where $\liminf_{n\to\infty} mE_{a_n} > 0$. Then

(3.310)    $|F_{a_n}(z)| > e^{-K}$    ($n = 1, 2, \ldots$)

on $E_{a_n}$, i.e.

$B\{|F_{a_n}(z)|, E_{a_n}\} \ge e^{-K},$

so, by (3.302),

(3.311)    $\lim_{n\to\infty} O\{\operatorname{arc} F_{a_n}(z), E_{a_n}\} = +\infty.$

Then there are points $P_{a_n}$, $Q_{a_n}$ on $E_{a_n}$ for large values of $n$ such that

(3.312)    $|\operatorname{arc} F_{a_n}(P_{a_n}) - \operatorname{arc} F_{a_n}(Q_{a_n})| \ge 4K.$

But then either $|\log F_{a_n}(P_{a_n})| \ge 2K$ or $|\log F_{a_n}(Q_{a_n})| \ge 2K$, which contradicts
(3.309). The theorem is thus completely proved.
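
The quotient in (3.302) can be evaluated explicitly in a simple case. The sketch below takes the family F_n(z) = exp(n (z+1)/(z-1)) on |z| = 1, which is an assumed illustration and is not claimed to be the example intended in the text. Here |F_n| = 1 on the boundary, so the boundary functions do not converge in measure to 0 and the lower bound B is 1 on every set; since (z+1)/(z-1) = -i cot(θ/2) at z = e^{iθ}, a continuous branch of arc F_n is -n cot(θ/2). For a continuous function on an arc the oscillation is taken here to be the maximum minus the minimum, which is assumed to agree with the definition of §1.

```python
import math

def arc_F(n, theta):
    """Continuous branch of arc F_n(e^{i theta}) for the assumed family:
    arc F_n = -n cot(theta / 2), 0 < theta < 2 pi."""
    return -n / math.tan(theta / 2)

def quotient(n, theta1, theta2, samples=2000):
    """B{|F_n|, E} / (1 + O{arc F_n, E}) on the arc E = [theta1, theta2].
    |F_n| = 1 on E, so B = 1; the oscillation is estimated on a grid."""
    values = [arc_F(n, theta1 + (theta2 - theta1) * k / samples)
              for k in range(samples + 1)]
    oscillation = max(values) - min(values)
    return 1.0 / (1.0 + oscillation)
```

On the arc π/2 <= θ <= π the function cot(θ/2) decreases from 1 to 0, so the oscillation is n and the quotient is 1/(1 + n). It tends to 0 only because the argument of F_n oscillates more and more, which is the situation Theorem 3.3 describes.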

Now consider the general problem proposed at the beginning of the section.
Necessary and sufficient conditions are to be found on the sequence
(3.02) that the sequence (3.01) converge uniformly to 0 in every closed subregion
of $|z| < 1$. The problem has just been solved if $f_n(z) \ne 0$ except for a
finite number of values of $n$. The following theorem gives the general solution.


THEOREM 3.4. Let $f_1(z), f_2(z), \ldots$ be a uniformly bounded sequence of functions
analytic for $|z| < 1$, with Fatou boundary functions $F_1(z), F_2(z), \ldots$
respectively. Let $\{A_n\}$, $\{A_n'\}$ be sequences of arcs on $|z| = 1$ such that $A_n$, $A_n'$
have no points in common and such that

(3.401)    $\liminf_{n\to\infty} mA_n > 0, \qquad \liminf_{n\to\infty} mA_n' > 0.$

Then there exist two sequences of uniformly bounded functions $g_1(z), g_2(z), \ldots$,
$h_1(z), h_2(z), \ldots$ analytic for $|z| < 1$ with Fatou boundary functions $G_1(z),
G_2(z), \ldots$, $H_1(z), H_2(z), \ldots$ respectively such that

(3.402)    $f_n(z) = g_n(z)\,h_n(z)$

and such that $g_n(z) \ne 0$ in $\mathfrak{S}(A_n)$,* $h_n(z) \ne 0$ in $\mathfrak{S}(A_n')$. A necessary and sufficient
condition that

(3.403)    $\lim_{n\to\infty} f_n(z) = 0$

uniformly in every closed subregion of $|z| < 1$ is that


* It will be convenient in this and the following sections if $A$ is an arc of the unit circle, to denote
by $\mathfrak{S}(A)$ the interior of the segment bounded by $A$ and by its chord.




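(3.404)    $\lim_{n\to\infty} \dfrac{B\{|G_n(z)|, E_n\}}{1 + O\{\operatorname{arc} G_n(z), E_n\}} \cdot \dfrac{B\{|H_n(z)|, E_n'\}}{1 + O\{\operatorname{arc} H_n(z), E_n'\}} = 0$ *

for every pair of sequences $\{E_n\}$, $\{E_n'\}$ of measurable point sets on $|z| = 1$ such
that $E_n \subset A_n$, $E_n' \subset A_n'$, $n = 1, 2, \ldots$, and such that

(3.405)    $\liminf_{n\to\infty} mE_n > 0, \qquad \liminf_{n\to\infty} mE_n' > 0.$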

We consider only the case in which $A_n$ and $A_n'$ are the same arcs $A$ and $A'$
respectively for all values of $n$. The general case can be proved by a slight
modification of the following proof.

Let $\phi_1(w)$, $\phi_2(w)$ be functions mapping $|w| < 1$ in a one-to-one and conformal
way on $\mathfrak{S}(A)$ and $\mathfrak{S}(A')$ respectively. These functions can easily be
determined explicitly. Let

(3.406)    $a_{n1}, a_{n2}, \ldots$

be the zeros of $f_n(z)$ in the interior of $\mathfrak{S}(A)$, where the non-simple zeros appear
in the list a number of times equal to their multiplicity. By a theorem of
Blaschke†

(3.407)    $h_n(z) = \prod_{j=1}^{\infty} \dfrac{|a_{nj}|}{a_{nj}} \cdot \dfrac{a_{nj} - z}{1 - \bar a_{nj} z}$

defines a bounded function, analytic for $|z| < 1$, where the product converges
uniformly in every closed subregion of $|z| < 1$. Define the function $g_n(z)$ by

(3.408)    $g_n(z) = f_n(z)/h_n(z).$

Then it is readily seen that $g_n(z)$ is a bounded function, analytic in the
interior of the unit circle. The zeros of the functions $g_n(z)$, $h_n(z)$ have the
required properties.
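
The factorization f_n = g_n h_n can be illustrated with finitely many zeros. The sketch below builds a finite Blaschke product with the customary normalizing factors |a|/a, which is assumed here to match the product used in the text, and divides a hypothetical function f with prescribed zeros by it. The zero list `ZEROS` and the function `f` are invented for the illustration.

```python
import cmath
import math

def blaschke_product(zeros, z):
    """Finite Blaschke product with the given zeros (each nonzero, |a| < 1)."""
    value = 1.0 + 0.0j
    for a in zeros:
        value *= (abs(a) / a) * (a - z) / (1 - a.conjugate() * z)
    return value

def f(zeros, z):
    """Hypothetical bounded analytic function whose zeros in |z| < 1 are `zeros`."""
    value = cmath.exp(z)
    for a in zeros:
        value *= (z - a)
    return value

def g(zeros, z):
    """Quotient f / h, evaluated away from the zeros themselves."""
    return f(zeros, z) / blaschke_product(zeros, z)

ZEROS = [0.3 + 0.0j, -0.2 + 0.5j]
```

The product has modulus 1 at every point of |z| = 1, so dividing by it does not change the modulus of the boundary function, and the quotient stays bounded away from 0 near each zero of f. In this finite case g(z) equals exp(z) times the product of the factors -(a/|a|)(1 - conj(a) z), which has no zeros in |z| < 1.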

Equation (3.404) is equivalent to the following:

(3.409)    $\lim_{n\to\infty} g_n[\phi_1(w)] \cdot h_n[\phi_2(w)] = 0,$

uniformly in every closed subregion of $|w| < 1$. For suppose that (3.409) is
true. If (3.404) is not also true there are subsequences $\{g_{a_n}(z)\}$, $\{h_{a_n}(z)\}$
(which we can suppose convergent in $|z| < 1$ since the sequences $\{g_n(z)\}$,


* Each branch of arc $g_n(z)$ is a single-valued function in $\mathfrak{S}(A_n)$, thus determining a single-valued
branch of arc $G_n(z)$. There are single-valued branches of arc $H_n(z)$ on $A_n'$ by the same argument.

† See, for example, Montel, Leçons sur les Familles Normales, Paris, 1927, p. 180. If $\mathfrak{S}(A)$
includes $z = 0$ and if $f_n(z)$ has a zero of order $\lambda_n$ there, the product is taken from $j = \lambda_n + 1$ to $\infty$ and
the factor $z^{\lambda_n}$ replaces the first $\lambda_n$ factors.




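$\{h_n(z)\}$ are normal families) and sets $\{E_{a_n}\}$, $\{E_{a_n}'\}$ satisfying (3.405) such
that

(3.410)    $\liminf_{n\to\infty} \dfrac{B\{|G_{a_n}(z)|, E_{a_n}\}}{1 + O\{\operatorname{arc} G_{a_n}(z), E_{a_n}\}} \cdot \dfrac{B\{|H_{a_n}(z)|, E_{a_n}'\}}{1 + O\{\operatorname{arc} H_{a_n}(z), E_{a_n}'\}} > 0.$

Let the point sets on $|w| = 1$ transformed into $E_{a_n}$ and $E_{a_n}'$ on $|z| = 1$ by the
transformations $z = \phi_1(w)$, $z = \phi_2(w)$ be $\mathfrak{E}_{a_n}$, $\mathfrak{E}_{a_n}'$ respectively. Then it is
easily seen that

(3.411)    $\liminf_{n\to\infty} m\mathfrak{E}_{a_n} > 0, \qquad \liminf_{n\to\infty} m\mathfrak{E}_{a_n}' > 0.$

Inequality (3.410) is the same as

(3.412)    $\liminf_{n\to\infty} \dfrac{B\{|G_{a_n}[\phi_1(w)]|, \mathfrak{E}_{a_n}\}}{1 + O\{\operatorname{arc} G_{a_n}[\phi_1(w)], \mathfrak{E}_{a_n}\}} \cdot \dfrac{B\{|H_{a_n}[\phi_2(w)]|, \mathfrak{E}_{a_n}'\}}{1 + O\{\operatorname{arc} H_{a_n}[\phi_2(w)], \mathfrak{E}_{a_n}'\}} > 0.$

But this means, by Theorem 3.3, that neither of the (convergent) sequences
$\{g_{a_n}[\phi_1(w)]\}$, $\{h_{a_n}[\phi_2(w)]\}$ can converge to the function 0, which contradicts
(3.409). Conversely suppose that (3.404) is true. If (3.409) is not true, there
are subsequences $\{g_{a_n}[\phi_1(w)]\}$, $\{h_{a_n}[\phi_2(w)]\}$ which are convergent to functions
which do not vanish identically. Then by Theorem 3.3 there are sequences
of sets $\{\mathfrak{E}_{a_n}\}$, $\{\mathfrak{E}_{a_n}'\}$ on $|w| = 1$ such that (3.411) and (3.412) are
satisfied. Letting the sets $\{E_{a_n}\}$, $\{E_{a_n}'\}$ on $|z| = 1$ correspond as above to the
sets $\{\mathfrak{E}_{a_n}\}$, $\{\mathfrak{E}_{a_n}'\}$, (3.404) is contradicted. Equations (3.409) and (3.404) are
thus equivalent.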

Equations (3.409) and (3.403) are also equivalent. For suppose that
(3.403) is true. We have to show that every limit function of the sequence
$\{g_n[\phi_1(w)] \cdot h_n[\phi_2(w)]\}$ is the function 0. Suppose that this is not the case.
Let $\{g_{a_n}[\phi_1(w)] \cdot h_{a_n}[\phi_2(w)]\}$ be a convergent subsequence, not converging
to the function 0. We can suppose further that the sequences $\{g_{a_n}[\phi_1(w)]\}$,
$\{h_{a_n}[\phi_2(w)]\}$ are also convergent, say to $g(w)$, $h(w)$ respectively. Since
$g(w)h(w) \not\equiv 0$, $g(w) \not\equiv 0$, and $h(w) \not\equiv 0$. Then the sequences $\{g_{a_n}(z)\}$, $\{h_{a_n}(z)\}$
converge in $\mathfrak{S}(A)$, $\mathfrak{S}(A')$ respectively to functions which do not vanish
identically. By a theorem of Stieltjes* these sequences are convergent
throughout $|z| < 1$ (to functions which cannot vanish identically). Then the
sequence $\{f_{a_n}(z)\}$ cannot converge to 0, contrary to the hypothesis that
(3.403) is true. Conversely suppose that (3.409) is true. We must show that
(3.403) is true, i.e. that the only limit function of the sequence $\{f_n(z)\}$ is the
function 0. Suppose $\{f_{a_n}(z)\}$ were a subsequence of $\{f_n(z)\}$ not converging to


* See for example P. Montel, loc. cit., pp. 28-30. 




0. We can suppose that the sequences $\{g_{a_n}(z)\}$, $\{h_{a_n}(z)\}$ are both convergent.
They converge to functions not vanishing identically, so the sequences
$\{g_{a_n}[\phi_1(w)]\}$, $\{h_{a_n}[\phi_2(w)]\}$ have the same property, contradicting (3.409).

It has thus been shown that (3.409), (3.404), and (3.409), (3.403), are
pairs of equivalent statements, proving the theorem.


4. THE PROPERTIES OF THE CLUSTER BOUNDARY FUNCTIONS 
OF A SEQUENCE OF MEROMORPHIC FUNCTIONS 
Let

(4.01)    $f_1(z), f_2(z), \ldots$

be a sequence of functions meromorphic for $|z| < 1$, with cluster boundary
functions

(4.02)    $\mathfrak{F}_1(z), \mathfrak{F}_2(z), \ldots$

respectively, as defined in §2. The problem to be attacked in this section is
that of finding the relations between these two sequences.


THEOREM 4.1. Let $f_1(z), f_2(z), \ldots$ be a sequence of functions meromorphic
for $|z| < 1$. Let the cluster sets of the sequence in $|z| < 1$ and on $|z| = 1$ be $s$ and
$S$ respectively. Then if there is a point $\alpha$ belonging to $s$ but not to $S$, no point of
the domain $D$ containing $\alpha$ and bounded only by points of $S$* is omitted by the
sequence.


This theorem is proved by an application of the maximum principle for 
analytic functions which is fairly obvious, so the proof will be omitted. The 
theorem is stated only to allow ready comparison with Theorem 4.2, the 
principal result of this section. To prove Theorem 4.2, which generalizes 
Theorem 4.1, we need a succession of lemmas. 


Lemma 4.1. Let $\{f_n(z)\}$ be a uniformly bounded sequence of functions
analytic for $|z| < 1$. Let $\{A_n\}$ be a sequence of arcs on $|z| = 1$,

$\liminf_{n\to\infty} mA_n > 0,$

and let the cluster set of the sequence on $|z| = 1$ on these arcs be $S$. Then if there is
a point $\alpha$ omitted by the sequence, not belonging to $S$, and such that

(4.11)    $\lim_{n\to\infty} f_n(0) = \alpha,$

every point except $\alpha$ of the domain $D$ containing $\alpha$ and bounded only by points of
$S$ is assumed by the sequence.

* The frontier points of a point set are the points every neighborhood of which contains a point
both of the set and of its complement. If every frontier point of a domain belongs to a point set $S$,
the domain will be said to be bounded only by points of $S$.




The fact that, in the case considered, a subset of $S$ necessarily bounds a
finite domain is not surprising in view of the information given by Theorem
3.3 about the oscillation of arc $[F_n(z) - \alpha]$ (where $F_n(z)$ is the Fatou boundary
function of $f_n(z)$). The theorem will be proved first under the hypothesis that
$f_n(z)$ is continuous on $A_n$.

Suppose that $\beta \ne \alpha$ were a point of $D$ not assumed by the sequence. Then
there would be a subsequence $\{f_{a_n}(z)\}$ omitting the value $\beta$. Let $D'$ be a domain
which, together with all its frontier points, is contained in $D$, and which
contains the points $\alpha$ and $\beta$. Then $f_{a_n}(z)$ on $A_{a_n}$ is outside $D'$ for large values
of $n$. Now if $\psi(w)$ is defined by

(4.12)    $\psi(w) = \Im \log \left( \dfrac{w - \alpha}{w - \beta} \right) = \operatorname{arc}(w - \alpha) - \operatorname{arc}(w - \beta),$

it is seen that $\psi(w)$ is single-valued and continuous in the complementary
set of $D'$, choosing that branch for which $\psi(\infty) = 0$. Then $\psi(w)$ must be
bounded in the complement of $D'$. In particular, there is a number $M$ such
that

(4.13)    $|\psi[f_{a_n}(z)]| = |\operatorname{arc}[f_{a_n}(z) - \alpha] - \operatorname{arc}[f_{a_n}(z) - \beta]| \le M$

for $z$ on $A_{a_n}$ and for values of $n$ so large that $f_{a_n}(z)$ is outside $D'$ on $A_{a_n}$: $n \ge N$.
Then if $E_{a_n} \subset A_{a_n}$ is a measurable point set on $|z| = 1$,

(4.14)    $|O\{\operatorname{arc}[f_{a_n}(z) - \alpha], E_{a_n}\} - O\{\operatorname{arc}[f_{a_n}(z) - \beta], E_{a_n}\}| \le 2M, \qquad n \ge N.$

Now by Theorem 3.3, if $\liminf_{n\to\infty} mE_{a_n} > 0$,

(4.15)    $\lim_{n\to\infty} \dfrac{B\{|f_{a_n}(z) - \alpha|, E_{a_n}\}}{1 + O\{\operatorname{arc}[f_{a_n}(z) - \alpha], E_{a_n}\}} = 0.$

This implies that

(4.16)    $\lim_{n\to\infty} O\{\operatorname{arc}[f_{a_n}(z) - \alpha], E_{a_n}\} = +\infty,$

since $\liminf_{n\to\infty} B\{|f_{a_n}(z) - \alpha|, E_{a_n}\}$ is not less than the minimum of the distance
from $\alpha$ to a point of $S$, which is positive. But (4.16) implies, by (4.14),
that

(4.17)    $\lim_{n\to\infty} \dfrac{B\{|f_{a_n}(z) - \beta|, E_{a_n}\}}{1 + O\{\operatorname{arc}[f_{a_n}(z) - \beta], E_{a_n}\}} = 0,$

since the denominator becomes infinite while the numerator is bounded uniformly
for all values of $n$. By Theorem 3.3, (4.17) implies that $\lim_{n\to\infty} f_{a_n}(0) = \beta$,
which is impossible since by hypothesis $\lim_{n\to\infty} f_{a_n}(0) = \alpha$. The hypothesis
that $\beta \ne \alpha$ was a point of $D$ not assumed by the sequence $\{f_n(z)\}$
has thus led to a contradiction. The proof will now be given without the
restriction that $f_n(z)$ be continuous on $A_n$. Let $\mathfrak{F}_n(z)$ be the cluster boundary
function of $f_n(z)$ on $|z| = 1$. Let the arc $A_n'$ have the same midpoint as $A_n$ but
be of half the length. Let $\mathfrak{A}_n'$ be that arc on $|z| = r_n < 1$ (where $r_n$ will be
determined below), which is cut off on $|z| = r_n$ by the sector of the unit circle
intercepting $A_n'$. Then $r_n$ can be so chosen that $1 - r_n < 1/n$ and that if $\mathfrak{z}$ is
any point on $\mathfrak{A}_n'$, there is a point $z$ on $A_n$ such that

(4.18)    $|\mathfrak{F}_n(z) - f_n(\mathfrak{z})| \le$

for some determination of $\mathfrak{F}_n(z)$ at $z$. Now consider the sequence $\{f_n(r_n z)\}$.
This sequence evidently has a subset $S'$ of $S$ as a cluster set on $|z| = 1$ on the
arcs $\{A_n'\}$. The function $f_n(r_n z)$ is continuous on $A_n'$, the sequence omits the
value $\alpha$ (not belonging to $S'$) and $\lim_{n\to\infty} f_n(0) = \alpha$. Then by what has been
proved already, the sequence assumes every value except $\alpha$ in the domain
$D'$ containing $\alpha$ and bounded only by points of $S'$. Since $S' \subset S$, $D' \supset D$, and
the lemma is proved.


Lemma 4.2. Let $\{f_n(z)\}$ be a sequence of functions meromorphic for $|z| < 1$.
Let $\{A_n\}$ be a sequence of arcs on $|z| = 1$, and let the cluster set of the sequence
in $|z| < 1$ with respect to these arcs and on $|z| = 1$ on these arcs be $s$ and $S$,
respectively. Then if there is a point $\alpha$ belonging to $s$ but not to $S$ and if $\alpha$ is
omitted by the sequence $\{f_n(z)\}$, there is at most one other point in the domain
$D$ containing $\alpha$ and bounded only by points of $S$ which is omitted by the sequence.
If there are two points in $D$ which are omitted by the sequence, no other point
of the extended plane can be omitted by the sequence.


(a) By hypothesis there is a subsequence $\{f_{a_n}(z)\}$ and a sequence $\{z_{a_n}\}$
of points in $|z| < 1$, such that (2.05) is true and such that if $A_{a_n}$ is transformed
into $A_n'$ by (2.09), (2.10) is true. We shall suppose that $f_n(z) \ne \alpha$,
except for a finite number of values of $n$, that

(4.21)    $\lim_{n\to\infty} f_n(0) = \alpha,$

and that

(4.22)    $\liminf_{n\to\infty} mA_n > 0.$

If this is not true already, we can use the sequence $\{g_n(z)\}$, where

(4.23)    $g_n(z) = f_{a_n}\!\left( \dfrac{z - z_{a_n}}{\bar z_{a_n} z - 1} \right),$

in conjunction with the arcs $\{B_n\}$ where $B_n = A_n'$. In this form the connection
between this and the previous lemma is obvious.

(b) Suppose that besides $\alpha$, $a$, $b$, $c$ are also values omitted by the sequence
$\{f_n(z)\}$, where $\alpha$, $a$, $b$, $c$ are supposed distinct and where $a$, $b$, $c$ do not necessarily
belong to $D$. We can suppose that $a$, $b$, $c$ are 0, 1, $\infty$ respectively. For
if this were not so we should prove the corresponding theorem for the sequence
$\{h_n(z)\}$, which has 0, 1, $\infty$ as exceptional points:

(4.24)    $h_n(z) = \dfrac{f_n(z) - a}{f_n(z) - b} \cdot \dfrac{c - b}{c - a}$

if $a$, $b$, $c$ are all finite. If one of the points is $\infty$, we can suppose that it is the
point $c$, and we define $h_n(z)$ by

(4.25)

Let $\zeta = \lambda_1(\zeta')$ be a single-valued analytic function mapping $|\zeta'| < 1$ on
the extended $\zeta$-plane less the points 0, 1, $\infty$ (the elliptic modular function
defined in the circle instead of in the half-plane). The function $\lambda_1(\zeta')$ maps
$|\zeta'| < 1$ in a one-to-one way on an infinitely many sheeted Riemann surface
with branch points at 0, 1, $\infty$. Let $\lambda(\zeta)$ be the inverse of $\lambda_1(\zeta')$, and form the
sequence $\{\phi_n(z)\}$ where

(4.26)    $\phi_n(z) = \lambda[f_n(z)].$

Since $f_n(z)$ omits the values 0, 1, $\infty$, $\phi_n(z)$ can be taken as any one of an
infinite set of single-valued analytic functions defined in $|z| < 1$, by the monodromic
theorem, and $|\phi_n(z)| < 1$. Choose some determination of $\lambda(\alpha)$: $\alpha'$.
Then a branch of ¢,(z) can be chosen for each value of m so that 


(4.27) lim $,(0) = a’. 


Since $f_n(z) \ne \alpha$, $\phi_n(z) \ne \alpha'$. Let the cluster set of the sequence $\{\phi_n(z)\}$ on
$|z| = 1$ on the arcs $\{A_n\}$ be $S'$. If $\zeta'$ is a cluster value of the function $\phi_n(z)$
at a point of $A_n$, and if $|\zeta'| < 1$, $\lambda_1(\zeta')$ is a cluster value of $f_n(z)$ at that point
of $A_n$. Then if $\zeta'$ is a point of $S'$ and if $|\zeta'| < 1$, $\lambda_1(\zeta')$ is a point of $S$. Then $\alpha'$
cannot belong to $S'$ or $\alpha$ would belong to $S$, since $|\alpha'| < 1$. Let $D'$ be the domain
containing $\alpha'$ and bounded only by points of $S'$. Then by Lemma 4.1
every point in $D'$ except $\alpha'$ is assumed by the sequence $\{\phi_n(z)\}$.

Now suppose that $\beta$ is a point of $D$, $\beta \ne \alpha, 0, 1, \infty$. Let $J$ be a Jordan arc
joining $\alpha$ to $\beta$ and lying wholly in $D$. It can be so chosen that it does not pass
through 0, 1 or $\infty$. Choose the branch of $\lambda(\zeta)$ for which $\lambda(\alpha) = \alpha'$ and using
that branch determine $J'$, the image of $J$ in the $\zeta'$-plane. Then we shall prove
that $J'$ lies wholly in $D'$. One end point, $\alpha'$, belongs to $D'$. If $J'$ is not wholly
in $D'$, there is a point of $J'$ on the boundary $S'$ of $D'$. We can suppose that
$\zeta'$ is the first such point on $J'$, tracing $J'$ from $\alpha'$. If $|\zeta'| < 1$, $\zeta = \lambda_1(\zeta')$ belongs
to $S$, as was noted above. But $\zeta$ is on $J$, belongs to $D$, and so cannot belong to
$S$. If $|\zeta'| = 1$, the arc $J$ must spiral infinitely often about 0, 1, or $\infty$, since it
remains at positive distance from the first two and remains in some circle
about the origin. But this is impossible, since $J$ is a Jordan arc. Then no
point on $J'$ can be on the boundary of $D'$, so $J'$ lies entirely in $D'$. This means
that $\beta' = \lambda(\beta)$ belongs to $D'$ and is therefore assumed by the sequence
$\{\phi_n(z)\}$. Then $\beta$ is assumed by the sequence $\{f_n(z)\}$.

It has thus been proved that if three points $a$, $b$, $c$ are exceptional, besides
$\alpha$, every point of $D$ save $\alpha$ and the points of $a$, $b$, $c$ belonging to $D$, is assumed
by the sequence (a subsequence of the original sequence).

(c) The result of (b) will now be sharpened. Suppose that $\beta \ne \alpha$ belongs
to $D$ and is an exceptional value of the sequence $\{f_n(z)\}$ for which (4.21)
and (4.22) are true. Suppose that $\gamma$ is a third exceptional value of the sequence,
not necessarily in $D$. We can suppose that $\alpha = 0$, $\beta = \infty$, $\gamma = 1$. Consider
the sequence $\{\psi_n(z)\}$ where

(4.28)    $\psi_n(z) = [f_n(z)]^{1/3}.$

Choosing any one of three branches, $\psi_n(z)$ is single-valued and analytic in
$|z| < 1$, by the monodromic theorem, since $f_n(z) \ne 0, \infty$. Moreover the sequence
$\{\psi_n(z)\}$ has the exceptional values

$\alpha_1 = 0, \quad \beta_1 = \infty, \quad a = e^{2\pi i/3}, \quad b = e^{4\pi i/3}, \quad c = 1,$

and

(4.29)    $\lim_{n\to\infty} \psi_n(0) = 0.$

Let $S_1$ be the cluster set of the sequence $\{\psi_n(z)\}$ on $|z| = 1$ on the arcs $\{A_n\}$.
If $\zeta_1$ is a point of $S_1$, $\zeta = \zeta_1^3$ is a point of $S$. The point $\alpha_1 = 0$ therefore does not
belong to $S_1$ or $\alpha = 0$ would belong to $S$. Let $D_1$ be the domain containing $\alpha_1$
and bounded only by points of $S_1$. It will be shown that $D_1$ contains $\beta_1$. Let
$J$ be a Jordan arc in $D$ with end points $\alpha = 0$, $\beta = \infty$.* A point $\zeta$ can be chosen
on $J$, so near $\alpha = 0$ that each determination of $\zeta^{1/3}$ lies in $D_1$, since $D_1$ includes
some neighborhood of the origin. Then continue one determination of $\zeta^{1/3}$
from $\zeta$ to $\beta$ along $J$, thus determining a Jordan arc $J_1$, which we shall prove
lies entirely in $D_1$. For if it did not, a point of $J_1$ would also be a point of $S_1$,

* This means that to J corresponds a Jordan arc on the sphere. 




the boundary of $D_1$, which would imply that a point of $J$ was a point of $S$.
Since this is not true, $\beta_1 = \infty$ must be a point of $D_1$. Now the sequence $\{\psi_n(z)\}$
omits three values $a$, $b$, $c$ besides $\alpha_1$, so that every point in $D_1$ (not one of these
values) must be assumed by the sequence, as was proved in (b). Therefore
$\beta_1 = \infty$ must be assumed. The sequence $\{f_n(z)\}$ must therefore assume the
value $\beta = \infty$, contrary to hypothesis.

It has thus been shown that if there is an exceptional value in $D$ besides $\alpha$,
no other point in the extended plane is omitted by the sequence. This proves
the lemma.

Lemma 4.3. Let $A$ be an arc of $|z| = 1$ and let $f(z)$ be a bounded function,
analytic for $z$ in the segment $\mathfrak{S}(A)$ of the unit circle, with Fatou boundary function
$F(z)$ on $A$. Suppose that

(4.31)    $|f(z)| < K$ in $\mathfrak{S}(A)$

and that

(4.32)    $|F(z)| < k.$

Then if $k' > k$,

(4.33)    $|f(z)| < k'$ in $\mathfrak{S}(A')$

where $A' \subset A$ and where $mA'$ is a function of $mA$, $k'/k$, $K/k$ only,

(4.34)    $mA' = \tau(mA, k'/k, K/k),$

$\tau > 0$ increasing with $mA$, $k'/k$, $k/K$.

This lemma is easily proved using the maximum principle.*


Lemma 4.4. Let $\{f_n(z)\}$ be a sequence of functions meromorphic for $|z| < 1$.
Let the cluster set and the strong cluster set of the sequence in $|z| < 1$ with respect
to a set of arcs $\{A_n\}$ on $|z| = 1$ be $s$, $s_1$ respectively, and let the cluster set of the
sequence on $|z| = 1$ on these arcs be $S$.

(a) $S \subset s_1 \subset s$.

(b) The points of $s$ not belonging to $S$ form an open set† consisting of non-overlapping
domains every one of which has at least one frontier point belonging
to $S$.

(c) If one point $\alpha$ of one of these domains belongs to $s_1$, every point of the
domain containing $\alpha$ and bounded only by points of $S$ belongs to $s_1$.


* Cf. for example the proof of a related fact obtained by W. Seidel, these Transactions, vol. 34
(1932), pp. 3-4.

† This set may be empty.




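(a) Let P belong to S. By hypothesis there is a sequence of points
$\{P_{\alpha_n}\}$, where $P_{\alpha_n}$ is a point of $A_{\alpha_n}$, such that
$\lim_{n\to\infty} \mathcal{F}_{\alpha_n}(P_{\alpha_n}) = P$, $\mathcal{F}_{\alpha_n}(P_{\alpha_n})$ re-
presenting one of the values of $\mathcal{F}_{\alpha_n}(z)$, the cluster boundary function of $f_{\alpha_n}(z)$,
at $P_{\alpha_n}$. We can suppose that $\liminf_{n\to\infty} mA_{\alpha_n} > 0$. There is a sequence of points
in $|z| < 1$ approaching $P_{\alpha_n}$, on which $f_{\alpha_n}(z)$ approaches $\mathcal{F}_{\alpha_n}(P_{\alpha_n})$. Then let $z_{\alpha_n}$
be one of these points so close to $P_{\alpha_n}$ that

$|\mathcal{F}_{\alpha_n}(P_{\alpha_n}) - f_{\alpha_n}(z_{\alpha_n})| < 1/n$

and that

$z' = \dfrac{z - z_{\alpha_n}}{\bar{z}_{\alpha_n} z - 1}$

transforms $A_{\alpha_n}$ into an arc of length not less than $2\pi - 1/n$. The existence of
the sequence $\{z_{\alpha_n}\}$ is the condition that P belong to $s_1$. Then $S \subset s_1$, and, by
definition, $s_1 \subset s$.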

(b) The first part of (b) is equivalent to the statement that the frontier
points of s which belong to s also belong to S. Suppose the contrary, that there
is a point $\alpha$, a frontier point of s belonging to s but not to S. We can suppose
that $\alpha$ is finite, substituting the sequence $\{1/f_n(z)\}$ for $\{f_n(z)\}$ if $\alpha = \infty$.
Making, if necessary, linear transformations taking $|z| < 1$ into itself for all
values of n, we can suppose that

(4.41) $\lim_{n\to\infty} f_{\alpha_n}(0) = \alpha$

and that

(4.42) $\liminf_{n\to\infty} mA_{\alpha_n} > 0$.

The sequence $\{f_{\alpha_n}(z)\}$ is normal. For otherwise there would be a point $z_0$
with the property that in any neighborhood of $z_0$ at most two values are
omitted by the sequence.* This would mean that every value of the plane
belonged to s, contradicting the fact that $\alpha$ is a frontier point of s. If there
were a limit function $f(z) \not\equiv \alpha$, $f(0) = \alpha$ necessarily and f(z) would assume every
value in some neighborhood of $\alpha$, for $|z| < 1/2$. Then the cluster set of $\{f_{\alpha_n}(z)\}$
in $|z| < 1/2$, which is a subset of s, would include this neighborhood of $\alpha$, con-
trary to the hypothesis that $\alpha$ was a frontier point of s. The sequence $\{f_{\alpha_n}(z)\}$
is thus a normal family with the single limit function $\alpha$, which implies, by
an argument similar to that used in the proof of Lemma 3.1, that

(4.43) $\lim_{n\to\infty} f_{\alpha_n}(z) = \alpha$

uniformly in every closed subregion of $|z| < 1$.


* P. Montel, Leçons sur les Familles Normales, Paris, 1927, p. 126.




Let every point of S be at distance greater than $3d > 0$ from $\alpha$. Then

(4.44) $|\mathcal{F}_{\alpha_n}(z) - \alpha| > 2d$ on $A_{\alpha_n}$

for large values of n: $n \ge N$, and every determination of $\mathcal{F}_{\alpha_n}(z)$. Since $\alpha$ is a
frontier point of s there is a point $\beta$ not belonging to s such that

(4.45) $|\alpha - \beta| < d$.

Then

(4.46) $|\mathcal{F}_{\alpha_n}(z) - \beta| > d$ on $A_{\alpha_n}$, $n \ge N$.

There is no sequence of points $\{\xi_n\}$ such that $\xi_n$ belongs to §$(A_{\gamma_n})$ and
such that $\lim_{n\to\infty} f_{\gamma_n}(\xi_n) - \beta = 0$, where $\{f_{\gamma_n}(z)\}$ is a subsequence of $\{f_{\alpha_n}(z)\}$.
For then $\beta$ would have to be a point of s, contrary to hypothesis. Then there
is a number K with the property that

(4.47) $1/|f_{\alpha_n}(z) - \beta| \le K$ in §$(A_{\alpha_n})$

for large values of n: $n \ge N_1$. But then if $d' < d$, by the previous lemma there
is a sequence of arcs $\{A'_{\alpha_n}\}$, $A'_{\alpha_n} \subset A_{\alpha_n}$, such that $\liminf_{n\to\infty} mA'_{\alpha_n} > 0$ and
such that

(4.48) $1/|f_{\alpha_n}(z) - \beta| < 1/d'$ in §$(A'_{\alpha_n})$, $n \ge N, N_1$.

Then by (4.43)

(4.49) $1/|\alpha - \beta| \le 1/d'$.

Since $d'$ was arbitrary, $d' < d$, (4.49) implies that $|\alpha - \beta| \ge d$, which contra-
dicts (4.45). The hypothesis that there was a frontier point of s belonging to s
but not to S has thus led to a contradiction.

The points common to s and the complement of S thus form an open set.
This set is the sum of non-overlapping domains. At least one frontier point of
each domain belongs to S. For if D is one of these domains and if $\alpha$ is a point
of D, we can suppose that (4.41) and (4.42) are true. Consider the set E of all
limit values of sequences of the form $\{f_{\alpha_n}(\xi_{\alpha_n})\}$ where $\xi_{\alpha_n}$ is a point of $|z| < 1$
on the radius from $z = 0$ to $Q_{\alpha_n}$, the midpoint of $A_{\alpha_n}$. This set E, a subset of s,
is readily seen to be closed and connected and to contain $\alpha$ and also at least
one point of S (in fact one of the limit values of the sequence $\{\mathcal{F}_{\alpha_n}(Q_{\alpha_n})\}$). By
a well known theorem, since both a point in D and a point not in D belong to
E, a frontier point of D must belong to E. We shall prove that this point P
belongs to S, thus completing the proof of (b). If P did not belong to S, it
would be a frontier point of s which belonged to s, since $E \subset s$. This is im-
possible by what has been proved already. Then P belongs to S.

(c) Statement (c) is equivalent to the statement that the frontier points
of $s_1$ (which belong to $s_1$ since $s_1$ is closed) are points of S. The proof is similar
to that of (b).

We now combine Lemma 4.4 with Lemma 4.2 to get the final result of 
this section. 


THEOREM 4.2. Let $\{f_n(z)\}$ be a sequence of functions meromorphic for $|z| < 1$.
Let the cluster set and the strong cluster set of the sequence in $|z| < 1$ with respect
to a set of arcs $\{A_n\}$ on $|z| = 1$ be $s$ and $s_1$ respectively and let S be the cluster set
of the sequence on $|z| = 1$ on these arcs. Let there be a point $\alpha$ belonging to s but
not to S.

(a) Suppose that no point of the domain D containing $\alpha$ and bounded only
by points of S belongs to $s_1$. Then if one point of the set $s \cdot D$, consisting of non-
overlapping domains each with at least one frontier point belonging to S, is
omitted by the sequence $\{f_n(z)\}$, only one other point of the extended plane can
be omitted and every point of the extended plane belongs to s.

(b) If a point of $s_1$ is in D, $D \subset s_1$, and at most two points of D are omitted
by the sequence $\{f_n(z)\}$. If two points of D are omitted no other point of the
extended plane is omitted and every point of the extended plane belongs to s.


(a) In (a) if a point $\alpha$ of $s \cdot D$, which was described in Lemma 4.4 (b), is
exceptional to the sequence $\{f_n(z)\}$, $\alpha$ cannot be a convergence value of the
sequence with respect to the arcs $\{A_n\}$, or $\alpha$ would belong to $s_1$ (cf. §2). Then
if we suppose, as we can, that a subsequence $\{f_{\alpha_n}(z)\}$ exists for which

$\lim_{n\to\infty} f_{\alpha_n}(0) = \alpha, \quad \liminf_{n\to\infty} mA_{\alpha_n} > 0,$

the sequence $\{f_{\alpha_n}(z)\}$ cannot be normal or $\alpha$ would be a convergence value
by an argument used in the proof of Theorem 3.2 (a). Then by a theorem
used above there must be a point in $|z| < 1$ in every neighborhood of which
the sequence $\{f_{\alpha_n}(z)\}$ can have at most two exceptional values, i.e. one
besides $\alpha$. This proves (a) completely.

(b) If a point $\alpha$ of $s_1$ is in D, $D \subset s_1 \subset s$ by Lemma 4.4 (c). Then any ex-
ceptional value of the sequence in D is an exceptional cluster value with re-
spect to the arcs $\{A_n\}$, and Lemma 4.2 can be applied. There only remains
the proof that if two points of D are exceptional, s is the entire extended
plane. We can suppose that there is a subsequence $\{f_{\alpha_n}(z)\}$ such that

$\lim_{n\to\infty} f_{\alpha_n}(0) = \alpha$, and $\lim_{n\to\infty} mA_{\alpha_n} = 2\pi$.

We consider $f_{\alpha_n}(z)$ for n so large that $mA_{\alpha_n} \ge 3\pi/2$, considering the values of
$f_{\alpha_n}(z)$ for z in the interior of a segment §$(A'_{\alpha_n})$ determined by a subarc
$A'_{\alpha_n}$ of $A_{\alpha_n}$ of length $3\pi/2$. Then it is immediate that if $\beta$ is arbitrary except
that $\beta$ is not one of the two values in D exceptional to the sequence $\{f_n(z)\}$
by hypothesis, $f_{\alpha_n}(z) - \beta = 0$ has a root in §$(A'_{\alpha_n})$ for an infinite set of
values of n (proof by mapping §$(A'_{\alpha_n})$ on a new unit circle and applying
Lemma 4.2). Then $\beta$ is a point of s, as was to be proved.


5. THE NEIGHBORHOOD PROPERTIES OF THE BOUNDARY FUNCTION 
OF A BOUNDED ANALYTIC FUNCTION 


Let f(z) be a bounded analytic function, defined for $|z| < 1$, with Fatou
boundary function F(z). We shall discuss the following two questions. Let
P be a point on $|z| = 1$. What are necessary and sufficient conditions on
F(z) in a neighborhood of P that F(z) be defined at P: $\lim_{r\to 1} f(re^{i\theta}) = F(P)$,
where $P = e^{i\theta}$? What are necessary and sufficient conditions on F(z) in a
neighborhood of P that f(z) have the cluster value $\alpha$ at P? The latter case can
be divided into two parts, according as $\alpha$ is or is not a non-tangential cluster
value. The most stress in this section will be laid on conditions which are both
necessary and sufficient and for this reason and for reasons of simplicity the
sufficient conditions will not be stated with the full generality possible.


THEOREM 5.1. Let f(z) be a bounded function analytic for $|z| < 1$ with Fatou
boundary function F(z), $|F(z)| \le 1$. Let P be a point on $|z| = 1$.

(a) If $\lim_{z\to P} |f(z)| = 1$ when z approaches P on a continuous curve C,
lying on one side of some chord through P, $|F(z)|$ is approximately continuous
at P on that side, if $|F(P)|$ is defined as 1. In particular, if C is a non-tangential
path, $\lim_{z\to P} |f(z)| = 1$ when z approaches P on every non-tangential path and
$|F(z)|$ is approximately continuous at P if $|F(P)|$ is defined as 1.

(b) If $|f(z)|$ has the cluster value 1 at P, $E(|F(z)| \ge 1 - \epsilon)$* is metrically
dense at P for all positive values of $\epsilon$. If 1 is a cluster value on some non-tan-
gential path it is a cluster value on every continuous non-tangential or tangential
curve to P and $|F(z)|$ is quasi-approximately continuous at P with limit value
1 there.


(a) It is convenient to prove the second part of (a) first. Define $f_1(\xi)$,
analytic in the upper half-plane with boundary function $F_1(\xi)$ by

(5.101) $f_1(\xi) = f\left(\dfrac{1+i\xi}{1-i\xi}\right), \quad F_1(\xi) = F\left(\dfrac{1+i\xi}{1-i\xi}\right).$

Using Lemma 1.1, we see that it is sufficient to prove the result corresponding
to the second part of (a) for $f_1(\xi)$ and its boundary function $F_1(\xi)$. We can
suppose that P is the point $z = 1$. Non-tangential paths to a point of the


* If a function F(z) is defined almost everywhere on |z|=1, it will be convenient to denote the 
set of points on |z | =1 at which F(z) satisfies a given inequality by E( ), where the inequality is 
enclosed by the parentheses. 




real axis are defined as paths which remain within some angle with vertex at 
the point whose sides are rays in the half-plane under consideration. 
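
The change of variable in (5.101) can be sanity-checked directly. The sketch below assumes the fractional linear map is $z = (1+i\xi)/(1-i\xi)$ and writes $\xi = x + iy$ (the symbols x, y are introduced here only for the computation); it verifies that the upper half-plane goes onto $|z| < 1$, the real axis onto $|z| = 1$, and the origin to $z = 1$.

```latex
% Assumption: the substitution is z = (1 + i\xi)/(1 - i\xi), with \xi = x + iy.
\begin{align*}
|1+i\xi|^2 &= (1-y)^2 + x^2, \qquad |1-i\xi|^2 = (1+y)^2 + x^2,\\
|1+i\xi|^2 - |1-i\xi|^2 &= -4y,\\
|z| < 1 &\iff y > 0, \qquad |z| = 1 \iff y = 0, \qquad z\big|_{\xi=0} = 1 .
\end{align*}
```

Because the map is conformal at $\xi = 0$, an angle with vertex at the origin in the half-plane corresponds to an angular region at $z = 1$ inside the disk, which is why non-tangential approach is preserved.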

By hypothesis, then, $\lim_{\xi\to 0} |f_1(\xi)| = 1$, when $\xi$ approaches $\xi = 0$ on $C_1$, a
non-tangential continuous curve. $C_1$ is included in the angle determined by
two rays, L′, L″, meeting at O: $\xi = 0$. We can suppose that L′, L″ are sym-
metric in the imaginary axis. Let $R_\lambda$ be the interior of the rectangle having
one side, of length $\lambda$, on the real axis and opposite side with end points on L′
and L″. The rectangle is symmetric in the imaginary axis. Let $P_\lambda$ be the
intersection of the diagonals of $R_\lambda$ and let $\phi(w)$ be the function mapping
$|w| < 1$ in a one-to-one and conformal way on $R_1$, so that $\phi(0) = P_1$ and so
that $\phi'(0)$ is real and positive.* We consider the family $\{g_\lambda(w)\}$ where
$g_\lambda(w) = f_1[\lambda\phi(w)]$, $0 < \lambda \le 1$. The function $g_\lambda(w)$ takes on those values in
$|w| < 1$ which $f_1(\xi)$ takes on in $R_\lambda$. Let $G_\lambda(w)$ be the Fatou boundary function
of $g_\lambda(w)$.

(i) $\lim_{\lambda\to 0} |g_\lambda(w)| = 1$ uniformly in every closed subregion of $|w| < 1$. For
there is a value of $\rho < 1$ such that $|w| = \rho$ corresponds to a simple closed ana-
lytic curve J in $R_1$ (by means of the transformation $\xi = \phi(w)$) which inter-
sects both L′ and L″ and therefore $C_1$.† Then for each positive value of $\lambda$,
there is a point $w_\lambda$, $|w_\lambda| = \rho$, such that $\lambda\phi(w_\lambda)$ is a point of $C_1$. Then

(5.102) $\lim_{\lambda\to 0} |g_\lambda(w_\lambda)| = 1, \quad |w_\lambda| = \rho < 1,$

and Lemma 3.1 proves the desired statement.‡

(ii) It follows from (i) that $\lim_{\xi\to 0} |f_1(\xi)| = 1$ uniformly in the angle con-
sidered.

(iii) It follows from (5.102), by Lemma 3.1, that the family $\{|G_\lambda(w)|\}$
converges in measure to 1 ($\lambda \to 0$). If $E_\lambda(\epsilon)$ is the set of those points on $|w| = 1$
for which $|G_\lambda(w)| < 1 - \epsilon$, $E_\lambda(\epsilon)$ corresponds to a set $E'_\lambda(\epsilon)$ on $R_\lambda$ by the trans-
formation $\xi = \lambda\phi(w)$ which is continuous on $|w| = 1$. The set $E'_\lambda(\epsilon)$ consists
of those points on the perimeter of $R_\lambda$ for which $|F_1(\xi)| < 1 - \epsilon$. The set
$E_\lambda(\epsilon)$ corresponds to a set of measure $mE'_\lambda(\epsilon)/\lambda$ on $R_1$ by the transformation
$\xi = \phi(w)$. We have


* The function $\phi(w)$ can be given by means of elliptic integrals: see for example W. F. Osgood,
Lehrbuch der Funktionentheorie, 5th edition, 1928, p. 437.

† This follows from the fact that if $\{w_n\}$ is a sequence of points in $|w| < 1$ and if
$\lim_{n\to\infty} |w_n| = 1$, the sequence $\{\phi(w_n)\}$ has no limit point in $R_1$, for which see, for example,
L. Bieberbach, Lehrbuch der Funktionentheorie, 2d edition, vol. 2, 1931, pp. 24-25.

‡ The theorems of the preceding section can be stated for a family of functions depending on a
parameter running through a continuous set of values to a limiting value just as well as for a family
depending on the parameter n taking on only integral values.




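(5.103) $\lim_{\lambda\to 0} mE_\lambda(\epsilon) = 0,$

which implies that

(5.104) $\lim_{\lambda\to 0} mE'_\lambda(\epsilon)/\lambda = 0$.*

But the measure of the set $\mathcal{E}_\lambda(\epsilon)$ of those points on the real axis on the interval
$|\xi| < \lambda/2$ for which $|F_1(\xi)| < 1 - \epsilon$ being less than $mE'_\lambda(\epsilon)$, it follows that

(5.105) $\lim_{\lambda\to 0} m\mathcal{E}_\lambda(\epsilon)/\lambda = 0,$

i.e. that $\mathcal{E}(\epsilon)$ has density 0 at $\xi = 0$. Since $\epsilon > 0$ was arbitrary, it has been
proved that $|F_1(\xi)|$ is approximately continuous at $\xi = 0$, if $|F_1(0)|$ is de-
fined as 1.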

To prove the first part of the statement (a) of the theorem we change the 
domain of definition of the function again. Define f2(£) analytic in the first 
quadrant, with boundary function F.(£), by 


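(5.106) $f_2(\xi) = f\left(\dfrac{1+i\xi^2}{1-i\xi^2}\right), \quad F_2(\xi) = F\left(\dfrac{1+i\xi^2}{1-i\xi^2}\right).$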


By the corollary to Lemma 1.1 it is sufficient to prove the results desired for
the function $f_2(\xi)$ and its boundary function $F_2(\xi)$ (we suppose that P is the
point $z = 1$, and the point in question in the $\xi$-plane is then the origin). Sup-
pose then that $\lim_{\xi\to 0} |f_2(\xi)| = 1$ when $\xi$ approaches O: $\xi = 0$ on a continuous
curve $C_2$ which lies in the angle formed by a ray L in the first quadrant with
one of the rays which bound the first quadrant. We can suppose that $C_2$ lies
in the angle between L and the (positive) real axis. Let the angle between L
and the positive real axis be $\theta$ and let L′ be a ray through O into the first
quadrant making an angle $\theta' > \theta$, $\theta' < \pi/2$, with the positive real axis. Let
$P_1$ be a point of $C_2$ and let $R_1$ be the rectangle whose diagonals $L'_1$, $L''_1$
intersect at $P_1$, and which has two vertices $Q_1$, $Q_2$ on the real axis. We suppose
$R_1$ so chosen that $L'_1 \parallel L'$. Then $Q_1$, $Q_2$ are both on the positive real axis; take
$OQ_1 > OQ_2$. The ray through $Q_2$ parallel to $L''_1$ must meet $C_2$ in at least one
point. Let $P_2$ be the intersection of the ray and $C_2$ which is nearest $Q_2$, and
let $R_2$ be the rectangle whose diagonals $L'_2$, $L''_2$ intersect at $P_2$, $L'_2 \parallel L'_1$,
$L''_2 \parallel L''_1$, and with vertices $Q_2$, $Q_3$ on the real axis. The point $Q_3$ must be on
the positive real axis since $L'_1 \parallel L'$. In this way we get a sequence of rec-
tangles $R_1, R_2, \cdots$ whose diagonals intersect at $P_1, P_2, \cdots$ and a sequence


* This can be proved most readily using the fact that $\phi(w)$ on $|w| = 1$ is continuous and has a
continuous derivative except at four points in the neighborhood of which $\phi(w)$ has the same character
as $w^{1/2}$ at $w = 0$.




of vertices $Q_1, Q_2, \cdots$ on the positive real axis, $OQ_1 > OQ_2 > \cdots$. The mono-
tone sequence $\{Q_n\}$ has a unique limiting point which must be O, for if it
were not O, the sequence $\{P_n\}$ would have as unique limit that same point
on the positive real axis, which is impossible since $P_n$ is on $C_2$ for all values of
n. Let $I_n$ be the closed interval with end points $Q_n$, $Q_{n+1}$. Then it is easily
seen that

(5.107) $\dfrac{mI_n}{OQ_{n+1}} \le \dfrac{2\tan\theta}{\tan\theta' - \tan\theta} = M.$

Let $\phi_n(w)$ map $|w| < 1$ in a one-to-one and conformal way on the interior of
$R_n$ so that $\phi_n(0) = P_n$, $\phi'_n(0) > 0$. Then

(5.108) $\phi'_n(w) = \dfrac{mI_n}{mI_1}\,\phi'_1(w).$

Consider the sequence $\{f_2[\phi_n(w)]\}$. Then

(5.109) $|f_2[\phi_n(w)]| < 1, \quad \lim_{n\to\infty} |f_2[\phi_n(0)]| = 1.$

Therefore, by Lemma 3.1, the sequence of the absolute values of the boun-
dary functions is convergent in measure to 1 on $|z| = 1$. Repeating an argu-
ment used above, if $E(\epsilon)$ is the set of those points on the positive real axis
for which $|F_2(\xi)| < 1 - \epsilon$, $\epsilon > 0$,

(5.110) $\lim_{n\to\infty} \dfrac{m[I_n \cdot E(\epsilon)]}{mI_n} = 0.$


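Now let I be a variable closed interval with one end point at O and the other
on the positive real axis. Let $\lambda = \lambda(I)$ be the smallest value of j for which $Q_j$
belongs to I. Then

(5.111) $\sum_{j=\lambda}^{\infty} mI_j \le mI \le \sum_{j=\lambda-1}^{\infty} mI_j$

and

(5.112) $m[I \cdot E(\epsilon)] \le \sum_{j=\lambda-1}^{\infty} m[I_j \cdot E(\epsilon)].$

Choose $\delta > 0$ so that

(5.113) $\dfrac{m[I_j \cdot E(\epsilon)]}{mI_j} < \dfrac{\eta}{1+M}$ for $j \ge \lambda - 1$, if $mI < \delta$,

for some fixed positive number $\eta$. Then

(5.114) $\dfrac{m[I \cdot E(\epsilon)]}{mI} \le \dfrac{\sum_{j=\lambda-1}^{\infty} m[I_j \cdot E(\epsilon)]}{\sum_{j=\lambda}^{\infty} mI_j} < \eta.$

Since $\eta$ was an arbitrary positive number, $|F_2(\xi)|$ must be approximately
continuous at O on the positive real axis, if $|F_2(0)|$ is defined as 1, as was to
be proved.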
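
The inequality (5.107), stated above as easily seen, can be checked with elementary coordinates. The following sketch uses the abbreviations $q = OQ_{n+1}$ and $\ell = mI_n$ (introduced here, not in the original) and assumes that the diagonal of $R_n$ parallel to L′ issues from the vertex $Q_{n+1}$ nearer O, and that $\theta < \theta' < \pi/2$.

```latex
% Assumed abbreviations: q = OQ_{n+1}, \ell = mI_n; R_n has base [q, q + \ell].
\begin{align*}
P_n &= \Bigl(q + \tfrac{\ell}{2},\ \tfrac{\ell}{2}\tan\theta'\Bigr)
  && \text{(center of } R_n\text{; one diagonal has slope } \tan\theta') \\
\tfrac{\ell}{2}\tan\theta' &\le \Bigl(q + \tfrac{\ell}{2}\Bigr)\tan\theta
  && \text{(} P_n \text{ lies on } C_2\text{, between the real axis and } L) \\
\ell\,(\tan\theta' - \tan\theta) &\le 2q\tan\theta \\
\frac{mI_n}{OQ_{n+1}} = \frac{\ell}{q} &\le \frac{2\tan\theta}{\tan\theta' - \tan\theta} = M .
\end{align*}
```

Since $\sum_{j \ge n} mI_j = OQ_n$, the same bound shows that adjoining one further interval $I_{n-1}$ enlarges this sum by at most the factor $1 + M$, which appears to be where the constant $1 + M$ in (5.113) enters.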

(b) If $|f(z)|$ has the cluster value 1 at P (which we suppose to be the point
$z = 1$) we go again to the function $f_1(\xi)$ defined by (5.101). We have a se-
quence of points $\{P_n\}$, $P_n \to$ O: $\xi = 0$, such that

(5.115) $\lim_{n\to\infty} |f_1(P_n)| = 1.$

Let $R_1$ be a rectangle with one side on the positive real axis and whose diag-
onals intersect at $P_1$. Let $R_n$, $n > 1$, be the rectangle with one side on the real
axis, whose diagonals are parallel to those of $R_1$ and intersect at $P_n$. The
rectangles are all similar and by considering $f_1(\xi)$ defined in $R_n$, we find by
reasoning similar to that used above that there is a sequence of intervals on
the real axis (the bases of the rectangles) such that, with self-explanatory
notation,


$\liminf_{n\to\infty} \dfrac{m[I_n \cdot E(|F_1(\xi)| \ge 1 - \epsilon)]}{mI_n} > 0,$


which shows that $E(|F_1(\xi)| \ge 1 - \epsilon)$ is metrically dense at $\xi = 0$ for all values
of $\epsilon > 0$, implying the same for $E(|F(z)| \ge 1 - \epsilon)$ at P on $|z| = 1$.

If the sequence $\{\xi_n\}$ is non-tangential, the first proof given in (a) can be
used, choosing a suitable sequence from the family $\{g_\lambda(w)\}$, to show that 1
is a cluster value of | f(z)| on every continuous non-tangential path to P and 
that | F(z)| is quasi-approximately continuous at P with limit value 1 there. 
There remains the proof that 1 is a cluster value on a continuous curve C 
which is tangent to |z| =1 at P. It is not difficult to reduce this to the results 
already proved, by the use of conformal mapping, and the details will not be 
given. If C is an arc of a circle, the preceding results show that | f(z)| will be 
even quasi-approximately continuous on C at P, with limit value 1 there. 

THEOREM 5.2. Let f(z) be a bounded function analytic for $|z| < 1$ with Fatou
boundary function F(z), $|F(z)| \le 1$. Let P be a point on $|z| = 1$.

(a) If F(z) is defined at P and if $|F(P)| = 1$, F(z) is approximately con-
tinuous at P.*


* The converse is empty, strictly speaking, since in the definition of approximate continuity at 
P, F(z) is supposed defined at P; cf. however Theorem 3 of the previous paper referred to above, and 
a note below. 




(b) A necessary and sufficient condition that f(z) have the cluster value $\alpha$
at P, if $|\alpha| = 1$, is that $E(|F(z) - \alpha| < \epsilon)$ be metrically dense at P for all positive
values of $\epsilon$. A necessary and sufficient condition that f(z) have $\alpha$ as a non-tan-
gential cluster value at P if $|\alpha| = 1$ is that F(z) be quasi-approximately contin-
uous at P with limit value $\alpha$ there. If $\alpha$ is a non-tangential cluster value at P,
$|\alpha| = 1$, it is a cluster value on every continuous non-tangential or tangential
path to P.

(c) In (a) if $w = f(z)$, for z on some continuous non-tangential curve to P,
defines a curve in the w-plane more closely tangent to $|w| = 1$ at $w = F(P)$ than
any circle $C_\rho$, of radius $\rho < 1$, in $|w| < 1$ and tangent to $|w| = 1$ at $w = F(P)$, the
same is true for the curve defined by $w = f(z)$ when z is on any non-tangential
curve to P and the metric density of the set $E_\rho$ of those points at which F(z) is
outside $C_\rho$ is 1 for all values of $\rho < 1$. In (b) if $\alpha$ is a cluster value at P:

(5.21) $\lim_{n\to\infty} f(z_n) = \alpha, \quad \lim_{n\to\infty} z_n = P, \quad |\alpha| = 1,$

and if the sequence $\{w_n\}$, $w_n = f(z_n)$, is more closely tangent to $|w| = 1$ at $w = \alpha$
than any circle $C_\rho$, $\rho < 1$, the set $E_\rho$ is metrically dense at P for all values of $\rho < 1$.
If the sequence $\{z_n\}$ in (5.21) is non-tangential and if the sequence $\{w_n\}$ has
the same property as above, $E_\rho$ has upper mean metric density 1 at P for all values
of $\rho < 1$.


The statements (a), (b) can be proved by an argument similar to that in 
the previous theorem, referring the result back to Theorem 3.1. The neces- 
sary conditions are simply Theorem 5.1 applied to 


elf (2) laj—1, 


We note that there are not two cases in (a) as there were in Theorem 5.1 
because by a theorem of Lindelöf,* if f(z) has a unique limit on a continuous
curve to P, f(z) will have that same limit on every non-tangential path. 
The sufficient conditions are independent of the fact that |a| =1 (cf. 
Theorem 3.1). 
The statement (c) can be deduced from the corollary to Theorem 3.1 or 
(a), (b) of Theorem 5.1 can be applied to the function 


This result is a sort of complement to a well known theorem of Julia, Wolff 
and Carathéodory.‡


* E. Lindelöf, Acta Societatis Scientiarum Fennicae, vol. 46 (1915), No. 4, p. 10.

† By using the criterion for convergence of a sequence given in the generalization of Theorem
3.1 (b), we can deduce again Theorem 3 of the previous paper, referred to above.

‡ See for example L. Bieberbach, Lehrbuch der Funktionentheorie, 2d edition, vol. 2, 1931, pp.
112-121.




THEOREM 5.3. Let f(z) be a bounded function, analytic for $|z| < 1$ with Fatou
boundary function F(z). Suppose that $f(z) \ne \alpha$ in some neighborhood of a point
P on $|z| = 1$.

(a) A necessary and sufficient condition that

$\lim 1/\log[f(z) - \alpha] = 0$*

when z approaches P on continuous non-tangential paths is that $1/\log[F(z) - \alpha]$
be approximately continuous at P if $1/\log[F(P) - \alpha]$ is defined as 0.

(b) A necessary and sufficient condition that $1/\log[f(z) - \alpha]$ have the
cluster value 0 at P is that $E\{|\log[F(z) - \alpha]| \ge M\}$ be metrically dense at P
for all values of M. A necessary and sufficient condition that $1/\log[f(z) - \alpha]$
have the non-tangential cluster value 0 at P is that $1/\log[F(z) - \alpha]$ be quasi-
approximately continuous at P with limit value 0 there. If 0 is a non-tangential
cluster value it is a cluster value on every continuous non-tangential or tangential
path to P.


This theorem can be deduced by means of Theorem 3.2 or by applying
Theorem 5.2 to the function

$\dfrac{\log f(z) + 1}{\log f(z) - 1}$

where we suppose that $\alpha = 0$ and that $|f(z)| < 1$ (cf. the proof of Theorem 3.2).

THEOREM 5.4. Let f(z) be a bounded function analytic for $|z| < 1$, with Fatou
boundary function F(z). Suppose that $f(z) \ne \alpha$ in some neighborhood of a point
P on $|z| = 1$.

(a) A necessary and sufficient condition that F(z) be defined at P and that
$F(P) = \alpha$ is that

(5.41) $\lim_{n\to\infty} \dfrac{B\{|F(z) - \alpha|,\ E_n\}}{1 + O\{\operatorname{arc}[F(z) - \alpha],\ E_n\}} = 0$

for every sequence $\{E_n\}$ of measurable point sets on $|z| = 1$ with the property that
there exists a sequence of arcs $\{A_n\}$ on $|z| = 1$ with midpoint P such that

(5.42) $E_n \subset A_n, \quad \lim_{n\to\infty} mA_n = 0, \quad \liminf_{n\to\infty} \dfrac{mE_n}{mA_n} > 0.$

(b) A necessary and sufficient condition that f(z) have $\alpha$ as a cluster value at
P is that there be a sequence of arcs $\{A_n\}$ on $|z| = 1$ whose end points approach
P with the property that

* Cf. the note on p. 427. 




(5.43) $\lim_{n\to\infty} \dfrac{B\{|F(z) - \alpha|,\ E \cdot A_n\}}{1 + O\{\operatorname{arc}[F(z) - \alpha],\ E \cdot A_n\}} = 0$

for every measurable set E such that

(5.44) $\liminf_{n\to\infty} \dfrac{mE \cdot A_n}{mA_n} > 0.$

(c) In (b) the conditions are necessary and sufficient that f(z) have $\alpha$ as a
non-tangential cluster value if the arcs $\{A_n\}$ are all taken to have the midpoint P.


The corresponding statement for f(z) defined in a half-plane is obvious, 
and it is convenient to prove it in this case. This is equivalent to proving the 
theorem as stated, as is shown by a slight extension of Lemma 1.1. The 
theorem is then easily deduced from Theorem 3.3 by considering f(z) (defined 
in the half-plane $\Im(z) > 0$) in suitable rectangles with bases on the real axis.
The discussion is analogous to that in the proof of Theorem 5.1. 


COROLLARY. Theorem 5.3 gives necessary and sufficient conditions (a) that
F(z) be defined at P, (b) that f(z) have a given cluster value at P, (c) that f(z) 
have a given non-tangential cluster value at P, if f(z) is supposed univalent.* 


For a univalent function f(z) defined for |z| <1 does not assume a cluster 
value at a point P on |z| =1 in some neighborhood of P. 

The problem set at the beginning of the section has thus been solved in a 
special case. If f(z) is bounded and analytic in the interior of the unit circle, 
with Fatou boundary function F(z), necessary and sufficient conditions have 
been found on F(z) in a neighborhood of a point P on |z| =1 that F(z) be 
defined at P, and that f(z) have the cluster value $\alpha$ at P, if $f(z) \ne F(P)$,
$f(z) \ne \alpha$, respectively, in some neighborhood of P. In the first case the con-
ditions need be modified only slightly if $f(z) = F(P)$ only at points of $|z| < 1$ on
one side of some chord through P; we need only consider F(z) on one side of 
P on |z| =1. In both cases, by using Theorem 3.4, the general case can be 
solved, but the statement becomes so complicated that it is of no interest. 


6. THE NEIGHBORHOOD PROPERTIES OF THE CLUSTER BOUNDARY 
FUNCTION OF A MEROMORPHIC FUNCTION 


Let f(z) be a function meromorphic for $|z| < 1$, with cluster boundary
function $\mathcal{F}(z)$.† Let P be a point on $|z| = 1$. What are the relations between
f(z) and $\mathcal{F}(z)$ in a neighborhood of P? A partial answer to this has been given
in §5, since if f(z) is bounded, the value of its Fatou boundary function at a


* A function f(z) is called univalent if $f(z_1) \ne f(z_2)$ unless $z_1 = z_2$.
† Cf. §2.




point P on $|z| = 1$ where it is defined is also one of the values of $\mathcal{F}(P)$. It will
be seen that the results of this section are generalizations of the following 
theorem. 


THEOREM 6.1. Let f(z) be meromorphic for |z| <1, and let S be the sum of 
the cluster sets of f(z) in |z| <1 at all the points of |z| =1. If f(z) takes on a value 
a not in S, f(z) assumes every value in the domain containing a and bounded only 
by points of S.* 


Let $f_n(z) = f(z)$, $n = 1, 2, \cdots$. Then this theorem is simply Theorem 4.1
for the sequence {f,(z)}. A simple direct proof is the following. The set of 
values s assumed by f(z) in |z| <1 is open. If s does not contain the domain 
D considered, there is a frontier point P of s in D. The point P is a limit point 
of assumed points: 

$\lim_{n\to\infty} f(z_n) = P$


for some sequence $\{z_n\}$ in $|z| < 1$. Since P is not assumed in $|z| < 1$,

$\lim_{n\to\infty} |z_n| = 1,$


and so P belongs to S, contrary to the hypothesis that P was in D. 
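
A simple illustration of Theorem 6.1, offered only as a sanity check and not taken from the original, is the function $f(z) = 1/z$, which is meromorphic for $|z| < 1$ with a pole at the origin. The symbols below follow the statement of the theorem.

```latex
% Illustrative example (an assumption of this note, not from the paper).
\begin{align*}
f(z) &= 1/z, \qquad |z| < 1,\\
S &= \{\, w : |w| = 1 \,\}
  \quad\text{(the cluster set at } e^{i\theta} \text{ is the single point } e^{-i\theta}\text{)},\\
\alpha &= f(0) = \infty \notin S,\\
D &= \{\, w : |w| > 1 \,\} \cup \{\infty\},
  \qquad f\bigl(\{|z| < 1\}\bigr) = D .
\end{align*}
```

Here f assumes every value of the domain D containing $\alpha$ and bounded only by points of S, as the theorem requires, while the other domain $|w| < 1$ determined by S contains no assumed value at all.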

The theorem corresponding to Theorem 4.2 in this development is an 
important theorem first proved by W. Gross and F. Iversen.† This theorem
can be proved most easily directly,‡ although the greater part can be proved
without difficulty by the methods of this paper. The following is a generaliza- 
tion of the Gross-Iversen theorem. 


THEOREM 6.2. Let f(z) be meromorphic in $\gamma$: $\Im(z) > 0$, and let the cluster
set of f(z) on the real axis at P: $z = 0$ on a given point set E be denoted by S(E).
Let $\alpha$ be a point of the cluster set s of f(z) in $\gamma$ at P:

(6.21) $\lim_{n\to\infty} z_n = 0, \quad \lim_{n\to\infty} f(z_n) = \alpha,$

and let $z_n = x_n + iy_n$ where $x_n$ and $y_n$ are real.


* This theorem, a generalization of a well known theorem of Darboux, was obtained recently
by Persidskij, Bulletin de la Société Physico-Mathématique de Kazan, vol. 3, No. 4 (1931), pp. 89-91.

† W. Gross, Monatshefte für Mathematik und Physik, vol. 29 (1918), pp. 3-47; Mathematische
Zeitschrift, vol. 2 (1918), pp. 242-294.

F. Iversen, Recherches sur les Fonctions Inverses des Fonctions Méromorphes, Thesis, Helsingfors,
1914; Oversikt av Finska Vetenskaps-Societetens Förhandlingar, vol. 58 (1915-1916), Section A,
No. 25; ibid., vol. 64 (1921-1922), Section A, No. 4.

‡ Cf. a paper by the author in the Annals of Mathematics, (2), vol. 33 (1932), pp. 753-757.




(a) Suppose that the sequence $\{z_n\}$ is tangential:

$\lim_{n\to\infty} \dfrac{y_n}{x_n} = 0.$

Let $R_1$ be the interior of a rectangle one of whose sides is on the real axis, and
whose diagonals intersect at $z_1$. Let $R_n$, $n > 1$, be the interior of a rectangle, one of
whose sides is on the real axis and whose diagonals, parallel to those of $R_1$, inter-
sect at $x_n + i\eta_n$ where $\eta_n$ is chosen so that


noo no Nn 
Let $E_N$ be the set of all those points on a base of $R_n$ for some value of $n \ge N$. Then
if $\alpha$ does not belong to S: the product of all the sets $\{S(E_N)\}$, f(z) assumes in $R_n$
each value in the domain D containing $\alpha$ and bounded only by points of S, for all
except perhaps a finite set of values of n, with two possible exceptions; if there are
two exceptions, they are the only ones in the extended plane for f(z) in the rec-
tangles $\{R_n\}$.

(b) Let $R_1$ be the interior of a rectangle one of whose sides is on the real axis
and whose diagonals intersect at $z_1$. Let $R_n$, $n > 1$, be the interior of the rec-
tangle one of whose sides is on the real axis and whose diagonals, parallel to those
of $R_1$, intersect at $z_n$. Suppose that $\alpha$ is omitted by f(z) in the rectangles $\{R_n\}$.
Let $E_N$ be the set of all those points on a base of $R_n$ for some value of $n \ge N$. Then
if $\alpha$ does not belong to S: the product of all the sets $\{S(E_N)\}$, f(z) assumes in $R_n$
each value in the domain D containing $\alpha$ and bounded only by points of S for all
except perhaps a finite set of values of n, with one possible exception, besides $\alpha$.
If there is one other such exceptional value, it is the only other one in the extended
plane for f(z) in the rectangles $\{R_n\}$.


In (a) the sets $S(E_N)$ are identical for large values of N. If the bases of the
rectangles $\{R_n\}$ do not cover the origin in (b), $S(E_N)$ can be used in place of
S, for any value of N. The results (a) and (b) are consequences of Theorem
4.2 (a) and (c) respectively, applied to f(z) defined in the rectangles de- 
scribed, and their proof presents no difficulty. 


CoLuMBIA UNIVERSITY, 
New York, N. Y. 




ON FINITE-ROWED SYSTEMS OF LINEAR 
INEQUALITIES IN INFINITELY MANY 
VARIABLES. II* 


BY 
I, J. SCHOENBERG 


1. Introduction. In a previous paper on the same subject† a certain class
of systems of linear inequalities in infinitely many variables were solved and 
applications to the theory of completely monotonic functions were derived 
from a particular type of such systems which were called Hausdorff systems. 
The present paper gives an extension of those results to a similar class of 
systems of linear inequalities involving a double sequence of variables (§2). 
This extension has already been performed by T. H. Hildebrandt and myself 
in the very particular case of completely monotonic double sequences.‡

In §4 Hausdorff systems involving a double sequence of variables are 
solved and applied to extend the results of F. Hausdorff, S. Bernstein, and 
D. V. Widder (see I, §11) to completely monotonic functions of two var- 
iables. The results of §3 (minimal solutions, minimal representations of solu- 
tions) though not absolutely necessary for the applications made in §5, help 
to present them in a more elegant manner. 

2. On a certain class of systems of linear inequalities for a double sequence
of variables. Let


Go2 boa bo3--- 
Ge bu bi bis: 


be two given infinite matrices of real numbers. Let 
(i:, i2,---, i) =| asa], li, 8 


Throughout this paper we shall suppose that 


* Presented to the Society, August 31, 1932; received by the editors August 2, 1932. 

† I. J. Schoenberg, On finite-rowed systems of linear inequalities in infinitely many variables,
these Transactions, vol. 34, pp. 594-619. In the text this paper will be designated by the symbol I.

‡ See Theorem 1 of T. H. Hildebrandt and I. J. Schoenberg, On linear functional operations and
the moment problem for a finite interval in one or several dimensions, to appear in the Annals of Mathe-
matics. We shall frequently use the results of §3 of this paper and refer to it by the symbol HS.




(i1, t2,--+, > 0, i2,---, >0 


(2.2) . 
(OSi<ie<--- 7 =1,2,3,---). 
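Let 
$$D_1^k\xi_m = \begin{vmatrix} \xi_m & a_{m,0} & \cdots & a_{m,k-1} \\ \vdots & \vdots & & \vdots \\ \xi_{m+k} & a_{m+k,0} & \cdots & a_{m+k,k-1} \end{vmatrix}, \qquad D_2^h\eta_n = \begin{vmatrix} \eta_n & b_{n,0} & \cdots & b_{n,h-1} \\ \vdots & \vdots & & \vdots \\ \eta_{n+h} & b_{n+h,0} & \cdots & b_{n+h,h-1} \end{vmatrix} \qquad (k, h, m, n = 0, 1, 2, \dots),$$
where $D_1^0\xi_m = \xi_m$ and $D_2^0\eta_n = \eta_n$. 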
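Let $\mu_{mn}$ $(m, n = 0, 1, 2, \dots)$ be a double sequence of real variables. Both 
expressions 

$$D_1^k(D_2^h\mu_{mn}), \qquad D_2^h(D_1^k\mu_{mn}),$$

in which the operator $D_1^k$ applies to the subscript $m$ and the operator $D_2^h$ 
applies to the subscript $n$, are linear homogeneous combinations of the 
$(k+1)(h+1)$ variables $\mu_{m+k',n+h'}$ $(k' = 0, 1, \dots, k;\ h' = 0, 1, \dots, h)$. Writing 
out these expressions it is readily found that the coefficients of $\mu_{m+k',n+h'}$ 
in both expressions are equal to the product of the same two cofactors: 

$$\frac{\partial D_1^k\xi_m}{\partial\xi_{m+k'}} \times \frac{\partial D_2^h\eta_n}{\partial\eta_{n+h'}}.$$

Hence 
$$D_1^k(D_2^h\mu_{mn}) = D_2^h(D_1^k\mu_{mn}) = D_1^kD_2^h\mu_{mn}.$$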
We shall be concerned with the problem of solving the system of linear in- 
equalities 
(2.4) $D_1^kD_2^h\mu_{mn} \ge 0 \qquad (k, h, m, n = 0, 1, 2, \dots).$


Without essentially restricting this problem, we shall suppose for con- 
venience that 


$$a_{i0} = b_{i0} = 1 \qquad (i = 0, 1, 2, \dots).$$
As in I, Part II, and in HS, §4, we shall first solve the finite system 
(2.5) $D_1^kD_2^h\mu_{mn} \ge 0 \qquad (k + m \le p,\ h + n \le p)$


454 I. J. SCHOENBERG [April 


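involving the $(p+1)^2$ variables $\mu_{mn}$ $(m, n = 0, 1, \dots, p)$. From the two 
identities (see I, p. 602) 
$$D_1^kD_2^h\mu_{mn} = \frac{(m+1, \dots, m+k)}{(m+1, \dots, m+k+1)}\,D_1^{k+1}D_2^h\mu_{mn} + \frac{(m, \dots, m+k)}{(m+1, \dots, m+k+1)}\,D_1^kD_2^h\mu_{m+1,n},$$
$$D_1^kD_2^h\mu_{mn} = \frac{[n+1, \dots, n+h]}{[n+1, \dots, n+h+1]}\,D_1^kD_2^{h+1}\mu_{mn} + \frac{[n, \dots, n+h]}{[n+1, \dots, n+h+1]}\,D_1^kD_2^h\mu_{m,n+1},$$
and (2.2) it follows that (2.5) is equivalent to its partial system 

(2.6) $D_1^{p-m}D_2^{p-n}\mu_{mn} \ge 0 \qquad (m, n = 0, 1, \dots, p),$

which we shall now solve. 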
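
The first of the two identities can be checked numerically. The sketch below assumes that $D^k\xi_m$ is the determinant whose first column is $\xi_m, \dots, \xi_{m+k}$ and whose remaining columns are the first $k$ columns of $A$ in the same rows, and that $(i_1, \dots, i_r)$ is the minor of $A$ taken from rows $i_1, \dots, i_r$ and the first $r$ columns. Under these assumptions the identity, cleared of its denominator, is an instance of the Desnanot-Jacobi determinant identity and holds for any matrix. The names `det`, `minor`, `D` and `identity_gap` are illustrative only and do not come from the paper.

```python
from fractions import Fraction

def det(rows):
    """Exact determinant by Gaussian elimination over Fractions."""
    a = [[Fraction(x) for x in row] for row in rows]
    n = len(a)
    if n == 0:
        return Fraction(1)  # empty determinant, used when k = 0
    sign, result = 1, Fraction(1)
    for c in range(n):
        pivot = next((r for r in range(c, n) if a[r][c] != 0), None)
        if pivot is None:
            return Fraction(0)
        if pivot != c:
            a[c], a[pivot] = a[pivot], a[c]
            sign = -sign
        result *= a[c][c]
        for r in range(c + 1, n):
            f = a[r][c] / a[c][c]
            for j in range(c, n):
                a[r][j] -= f * a[c][j]
    return sign * result

def minor(A, rows):
    """(i_1, ..., i_r): rows i_1..i_r and the first r columns of A (assumed)."""
    r = len(rows)
    return det([[A[i][t] for t in range(r)] for i in rows])

def D(A, xi, k, m):
    """D^k xi_m: column xi, then the first k columns of A, rows m..m+k (assumed)."""
    return det([[xi[m + i]] + [A[m + i][t] for t in range(k)]
                for i in range(k + 1)])

def identity_gap(A, xi, k, m):
    """Left side minus right side of the identity, multiplied by its denominator."""
    lhs = D(A, xi, k, m) * minor(A, range(m + 1, m + k + 2))
    rhs = (minor(A, range(m + 1, m + k + 1)) * D(A, xi, k + 1, m)
           + minor(A, range(m, m + k + 1)) * D(A, xi, k, m + 1))
    return lhs - rhs
```

With a first column of ones, `D(A, xi, 1, m)` reduces to `xi[m] - xi[m + 1]`, which is the ordinary first difference with reversed sign.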
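Let us define two new operators 

$$\Theta_1^k\xi_m = \sum_{r=m}^{m+k} \frac{(m, r+1, \dots, m+k)}{(r, \dots, m+k)(r+1, \dots, m+k)}\,\xi_r, \qquad \Theta_2^h\eta_n = \sum_{s=n}^{n+h} \frac{[n, s+1, \dots, n+h]}{[s, \dots, n+h][s+1, \dots, n+h]}\,\eta_s.$$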

From the fact that the two linear transformations I(7.4) and I(7.6) are inverse 
to each other, it follows that the two linear transformations 

(2.7) $\rho_{pm} = D_1^{p-m}\xi_m, \qquad \xi_m = \Theta_1^{p-m}\rho_{pm} \qquad (m = 0, 1, \dots, p)$

are inverse to each other and also that the same thing is true for the 
transformations 

(2.8) $\sigma_{pn} = D_2^{p-n}\eta_n, \qquad \eta_n = \Theta_2^{p-n}\sigma_{pn} \qquad (n = 0, 1, \dots, p).$

Let us now consider the linear transformation 
(2.9) $\rho_{pmn} = D_1^{p-m}D_2^{p-n}\mu_{mn} \qquad (m, n = 0, 1, \dots, p).$


From (2.7) and (2.8) we derive successively 


Hence 


(2.10) $\mu_{mn} = \Theta_1^{p-m}\Theta_2^{p-n}\rho_{pmn} \qquad (m, n = 0, 1, \dots, p)$


is the linear transformation inverse to (2.9). The system (2.10) gives, for 
$\rho_{pmn} \ge 0$, the most general solution of (2.6) and hence of (2.5). 




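The explicit form of (2.10) is 
$$\mu_{mn} = \sum_{r=m}^{p}\sum_{s=n}^{p} \frac{(m, r+1, \dots, p)\,[n, s+1, \dots, p]}{(r, \dots, p)(r+1, \dots, p)\,[s, \dots, p][s+1, \dots, p]}\,\rho_{prs}.$$
Introducing the quantities 

(2.11) $\lambda_{prs} = \dfrac{(0, r+1, \dots, p)\,[0, s+1, \dots, p]}{(r, \dots, p)(r+1, \dots, p)\,[s, \dots, p][s+1, \dots, p]}\,\rho_{prs}$

we get 
$$\mu_{mn} = \sum_{r=m}^{p}\sum_{s=n}^{p} \frac{(m, r+1, \dots, p)\,[n, s+1, \dots, p]}{(0, r+1, \dots, p)\,[0, s+1, \dots, p]}\,\lambda_{prs},$$
which we write 

(2.12) $\mu_{mn} = \sum_{r,s=0}^{p} c_{mrp}\,d_{nsp}\,\lambda_{prs} \qquad (m, n = 0, 1, \dots, p)$

where 

(2.13) $c_{mrp} = \dfrac{(m, r+1, \dots, p)}{(0, r+1, \dots, p)}, \qquad d_{nsp} = \dfrac{[n, s+1, \dots, p]}{[0, s+1, \dots, p]}$
$(m, r, n, s = 0, 1, \dots, p;\ c_{mpp} = (m)/(0) = 1;\ d_{npp} = [n]/[0] = 1).$


Let 
(2.14) $x_{pr} = c_{1rp}, \qquad y_{ps} = d_{1sp} \qquad (r, s = 0, 1, \dots, p).$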


As in I, §8, let $u = P_m^{(p)}(x)$ $(p \ge m)$ denote the polygonal line in the plane $(x, u)$, 
joining the points $(x_{pr}, c_{mrp})$ $(r = 0, 1, \dots, p)$; similarly let $v = Q_n^{(p)}(y)$ $(p \ge n)$ 
be the polygonal line in the plane $(y, v)$, joining the points $(y_{ps}, d_{nsp})$ 
$(s = 0, 1, \dots, p)$. 
It has been proved in I, §8, that 
(2.15) $\lim_{p\to\infty} P_m^{(p)}(x) = \varphi_m(x), \qquad \lim_{p\to\infty} Q_n^{(p)}(y) = \psi_n(y) \qquad (m, n = 0, 1, 2, \dots)$

hold uniformly in $x$ and $y$ respectively, in the interval (0, 1); moreover 
$\varphi_m(x)$ and $\psi_n(y)$ are the sequences of functions associated with the matrices 
$A$ and $B$ by Theorem 8.1 of I. It is shown there that the $\varphi_m(x)$ are continuous, 
non-decreasing, convex, and 


$$\varphi_0(x) \equiv 1, \quad \varphi_1(x) = x, \quad \varphi_{m+1}(0) = 0, \quad \varphi_m(1) = 1 \qquad (0 \le x \le 1;\ m = 0, 1, 2, \dots).$$


The $\psi_n(y)$ have the same properties. 




From Theorem 8.1 of I we know that the most general solutions of the 
two systems of linear inequalities 


(2.16) $D_1^k\xi_m \ge 0 \quad (k, m = 0, 1, 2, \dots), \qquad D_2^h\eta_n \ge 0 \quad (h, n = 0, 1, 2, \dots)$


are given respectively by 

(2.17)
$$\xi_m = \int_0^1 \varphi_m(x)\,d\chi_1(x) \quad (m = 0, 1, 2, \dots), \qquad \eta_n = \int_0^1 \psi_n(y)\,d\chi_2(y) \quad (n = 0, 1, 2, \dots),$$
where $\chi_1(x)$ and $\chi_2(y)$ are monotonic in (0, 1). The same theorem says that 
the monotonic function $\chi_1(x)$ is essentially uniquely defined by the first set 
of equations (2.17) if and only if every function $f(x)$ which is continuous on 
(0, 1) can be uniformly approximated as closely as we please by linear 
combinations of functions of the sequence $\{\varphi_m(x)\}$. A sequence of continuous 
functions $\{\varphi_m(x)\}$ with this property shall be called a base of continuous functions 
on (0, 1). The same definition will be used for functions of several variables. 
For convenience we introduce the following 


DEFINITION 2.1. The first system (2.16) shall be called a determining system 
if and only if the corresponding sequence $\{\varphi_m(x)\}$ is a base of continuous 
functions on (0, 1). Otherwise it shall be called a non-determining system. 

The following theorem is readily proved. 

THEOREM 2.1. (1) If the solutions of the systems (2.16), as given by Theorem 
8.1 of I, are (2.17), then the most general solution of the system 


(2.18) $D_1^kD_2^h\mu_{mn} \ge 0 \qquad (k, h, m, n = 0, 1, 2, \dots)$


may be expressed in the form 

(2.19) $\mu_{mn} = \displaystyle\int_0^1\!\!\int_0^1 \varphi_m(x)\,\psi_n(y)\,d_xd_y\chi(x, y) \qquad (m, n = 0, 1, 2, \dots)$


with $\chi(x, y)$ monotonic in the sense of Hardy and Krause (see HS, §3), and 
conversely, (2.19) always represents a solution of (2.18). 
(2) A necessary and sufficient condition that the function $\chi(x, y)$ be uniquely 
defined by the set (2.19) and the additional conditions 
(2.20) $\chi(0, 0) = \chi(x, 0) = \chi(0, y) = 0, \qquad \chi(x, y) = \chi(x + 0, y + 0),$


is that both systems (2.16) shall be determining systems, in which case also (2.18) 
shall be called a determining system. 




Let $\mu_{mn}$ $(m, n = 0, 1, 2, \dots)$ be a solution of (2.18). Then (2.5) holds for 
every value of $p$ and therefore also all the consequences derived therefrom. 

Let us define in the unit-square $U$ $(0 \le x \le 1,\ 0 \le y \le 1)$ a step-function 
$\chi_p(x, y)$ as follows: 


(a) $\chi_p(x, 0) = \chi_p(0, y) = 0$; 
(b) $\chi_p(x_{pr} + 0, y_{ps} + 0) - \chi_p(x_{pr} + 0, y_{ps} - 0) - \chi_p(x_{pr} - 0, y_{ps} + 0) + \chi_p(x_{pr} - 0, y_{ps} - 0) = \lambda_{prs}$ $(r, s = 0, 1, \dots, p)$;* 


(c) $\chi_p(x, y)$ is constant in each of the rectangles 
$$x_{pr} \le x < x_{p,r+1}, \qquad y_{ps} \le y < y_{p,s+1}$$
and also on each of the line segments 
$$x_{pr} \le x < x_{p,r+1},\ y = 1; \qquad x = 1,\ y_{ps} \le y < y_{p,s+1};$$


moreover $\chi_p(x, y) = \chi_p(x + 0, y + 0)$ for $0 < x < 1$, $0 < y < 1$. 
From (2.11), (2.9), (2.6) and (2.12) (for $m = n = 0$) we conclude that 


(2.21) $\lambda_{prs} \ge 0 \quad (r, s = 0, 1, \dots, p), \qquad \sum_{r,s=0}^{p} \lambda_{prs} = \mu_{00},$


which shows that $\{\chi_p(x, y)\}$ is a sequence of uniformly bounded monotonic 
functions in $U$. On the other hand, from (2.12) we derive for $p \ge \max(m, n)$ 

$$\mu_{mn} = \sum_{r,s=0}^{p} P_m^{(p)}(x_{pr})\,Q_n^{(p)}(y_{ps})\,\lambda_{prs} = \int_0^1\!\!\int_0^1 P_m^{(p)}(x)\,Q_n^{(p)}(y)\,d_xd_y\chi_p(x, y)$$
$$= \int_0^1\!\!\int_0^1 \varphi_m(x)\,\psi_n(y)\,d_xd_y\chi_p(x, y) + \epsilon_{pmn},$$

where $\epsilon_{pmn} \to 0$ as $p \to \infty$, on account of the uniform convergence in (2.15) 
and the uniform boundedness of the $\chi_p(x, y)$. From a theorem of J. Radon,† 
we know that there is a subsequence $\chi_q$ of $\chi_p$ converging in $U$ to a monotonic 
function $\chi$. From the same theorem and our last relation we derive (2.19) as 
$p = q \to \infty$. 


* $\chi_p(1 + 0, y_{ps} + 0)$ means $\chi_p(1, y_{ps} + 0)$; $\chi_p(0 - 0, y_{ps} + 0)$ means $\chi_p(0, y_{ps} + 0)$, etc. 

† This theorem, which is an extension to functions of two variables of a well known theorem of 
Helly, says: If $\{\chi_p(x, y)\}$ is a sequence of functions which are uniformly bounded and uniformly of 
bounded variation in $U$, then there is a subsequence $\{\chi_q(x, y)\}$ converging everywhere in $U$ to a 
function $\chi(x, y)$ of bounded variation in $U$. Moreover, for every $f(x, y)$ continuous in $U$ 

$$\lim_{q\to\infty} \int_0^1\!\!\int_0^1 f(x, y)\,d_xd_y\chi_q(x, y) = \int_0^1\!\!\int_0^1 f(x, y)\,d_xd_y\chi(x, y).$$

J. Radon, Sitzungsberichte der Wiener Akademie, vol. 122 IIa (1913), pp. 1337-1342, and vol. 128 
IIa (1919), pp. 1092-1094, proved this theorem in a slightly weaker form, which, however, would 
also suffice for our purpose. For the present statement see HS, §3, Lemma 1. 




Conversely, let $\chi$ in (2.19) be a monotonic function. We have to show 
that (2.19) represents a solution of (2.18). Indeed 

$$D_1^kD_2^h\mu_{mn} = \int_0^1\!\!\int_0^1 D_1^kD_2^h\,[\varphi_m(x)\,\psi_n(y)]\,d_xd_y\chi(x, y) = \int_0^1\!\!\int_0^1 D_1^k\varphi_m(x)\,D_2^h\psi_n(y)\,d_xd_y\chi(x, y) \ge 0,$$

since $D_1^k\varphi_m(x) \ge 0$, $D_2^h\psi_n(y) \ge 0$, for any $x$ and $y$ on (0, 1). 

The second part of Theorem 2.1 follows readily from an extension to two 
variables of a theorem of F. Riesz.* From this theorem it follows that $\chi(x, y)$ 
is uniquely defined by (2.19) and (2.20) if and only if $\{\varphi_m(x)\psi_n(y)\}$ is a base 
of continuous functions in $U$. However, it is readily shown that $\{\varphi_m(x)\psi_n(y)\}$ 
is a base of continuous functions in $U$ if and only if both sequences $\{\varphi_m(x)\}$ 
and $\{\psi_n(y)\}$ are such bases in (0, 1). For if both sequences $\{\varphi_m(x)\}$, 
$\{\psi_n(y)\}$ are bases, then every polynomial $P(x, y)$ can be uniformly 
approximated by expressions of the form $\sum_{m,n} \gamma_{mn}\varphi_m(x)\psi_n(y)$, hence also any 
continuous $f(x, y)$. Conversely, if this is true for any $f(x, y)$, then in particular 
for any continuous $f(x)$ we have 

$$f(x) = \sum_{m,n} \gamma_{mn}\,\varphi_m(x)\,\psi_n(y) + \rho(x, y) \qquad (|\rho| < \epsilon)$$

throughout $U$. An integration over (0, 1) with respect to $y$ shows that $\{\varphi_m(x)\}$ 
is a base in (0, 1). This completes the proof of Theorem 2.1. 
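
The converse half of Theorem 2.1 can be illustrated in the special case of completely monotonic double sequences mentioned in the Introduction. The sketch assumes that this case corresponds to $\varphi_m(x) = x^m$, $\psi_n(y) = y^n$, and that the operators $D_1^k$, $D_2^h$ then reduce, up to positive factors, to iterated differences with alternating signs; both are assumptions made for the example. The mass distribution (unit mass spread uniformly along the diagonal $x = y$ of $U$) and the function names are likewise illustrative.

```python
from fractions import Fraction
from math import comb, factorial

def mu(m, n):
    """Moments of x^m y^n for unit mass spread uniformly on the diagonal x = y."""
    return Fraction(1, m + n + 1)

def mixed_difference(seq, k, h, m, n):
    """sum over i, j of (-1)^(i+j) C(k,i) C(h,j) seq(m+i, n+j)."""
    return sum((-1) ** (i + j) * comb(k, i) * comb(h, j) * seq(m + i, n + j)
               for i in range(k + 1) for j in range(h + 1))

def beta_integral(a, b):
    """Exact value of the integral of x^a (1 - x)^b over (0, 1)."""
    return Fraction(factorial(a) * factorial(b), factorial(a + b + 1))
```

For this choice the mixed difference equals the integral of $x^{m+n}(1-x)^{k+h}$ over (0, 1), which is positive, so every inequality of the system holds.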

3. Minimal solutions. We have so far solved completely the following 
three systems of linear inequalities: 


(3.1) $D_1^k\xi_m \ge 0 \qquad (k, m = 0, 1, 2, \dots),$
(3.2) $D_2^h\eta_n \ge 0 \qquad (h, n = 0, 1, 2, \dots),$
(3.3) $D_1^kD_2^h\mu_{mn} \ge 0 \qquad (k, h, m, n = 0, 1, 2, \dots);$


and their most general solutions were found to be 


* See F. Riesz, loc. cit. in I, §1. The extended theorem says: If $\varphi_{mn}(x, y)$ $(m, n = 0, 1, 2, \dots)$ is 
a double sequence of continuous functions in $U$, then a function $\chi(x, y)$ monotonic in $U$ is uniquely 
defined by the set of equations 

$$\mu_{mn} = \int_0^1\!\!\int_0^1 \varphi_{mn}(x, y)\,d_xd_y\chi(x, y) \qquad (m, n = 0, 1, 2, \dots)$$

and the conditions (2.20), if and only if $\{\varphi_{mn}(x, y)\}$ is a base of continuous functions in $U$. There 
seems to be no proof of this theorem in the literature. However, the proof for the case of one variable 
given by W. Seidel, Annals of Mathematics, (2), vol. 32 (1931), pp. 777-784, can be extended immedi- 
ately to prove the theorem just stated, if one applies Lemma 1 of HS, §3. The same theorem may 
also be derived from general results concerning linear metric spaces. See I. J. Schoenberg and W. Sei- 
del, On linear operations in linear metric spaces, to appear in these Transactions. 




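(3.1') $\xi_m = \displaystyle\int_0^1 \varphi_m(x)\,d\chi_1(x) \qquad (m = 0, 1, 2, \dots)$
$(\chi_1(0) = 0,\ \chi_1(x) = \chi_1(x + 0)$ for $0 < x < 1)$, 

(3.2') $\eta_n = \displaystyle\int_0^1 \psi_n(y)\,d\chi_2(y) \qquad (n = 0, 1, 2, \dots)$
$(\chi_2(0) = 0,\ \chi_2(y) = \chi_2(y + 0)$ for $0 < y < 1)$, 

(3.3') $\mu_{mn} = \displaystyle\int_0^1\!\!\int_0^1 \varphi_m(x)\,\psi_n(y)\,d_xd_y\chi(x, y) \qquad (m, n = 0, 1, 2, \dots)$
$(\chi(x, y)$ satisfies (2.20)), 


respectively, where $\chi_1(x)$, $\chi_2(y)$ and $\chi(x, y)$ are monotonic. 
If $\xi_m$ is a solution of (3.1), then also $\xi_m + (0)^m\gamma$ $(\gamma > 0)$ is such a solution.* 
We shall need the following 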


DEFINITION 3.1. A solution $\xi_m$ of (3.1) is called a minimal solution† if there 
is no other solution $\bar\xi_m$ of (3.1) and a constant $\gamma > 0$ such that 
$$\xi_m = \bar\xi_m + (0)^m\gamma \qquad (m = 0, 1, 2, \dots).$$
We prove now 


THEOREM 3.1. Let (3.1) be a determining system.‡ Its solution $\xi_m$ given by 
(3.1') is a minimal solution if and only if the monotonic function $\chi_1(x)$ is 
continuous at $x = 0$. 


The condition $\chi_1(0) = \chi_1(+0)$ is necessary for $\xi_m$ to be a minimal solution 
of (3.1). For let us suppose that $0 = \chi_1(0) < \chi_1(+0)$, and let us define the 
function $\chi_{10}(x) = \chi_1(x) + (0)^x\chi_1(+0)$ which is continuous at the origin. Then 

$$\xi_m = \int_0^1 \varphi_m(x)\,d\chi_1(x) = \int_0^1 \varphi_m(x)\,d\chi_{10}(x) + (0)^m\chi_1(+0)$$

shows that $\xi_m$ is no minimal solution of (3.1). 
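
The splitting used in this argument can be sketched for a step function $\chi_1$ with finitely many jumps. The example assumes $\varphi_m(x) = x^m$, so that $\varphi_m(0) = (0)^m$, and represents $\chi_1$ by a list of pairs (position, jump); the function names are hypothetical. Python evaluates `0 ** 0` as 1, which agrees with the convention $(0)^0 = 1$.

```python
from fractions import Fraction

def moments(atoms, count):
    """xi_m = sum of w * x**m over the jumps (x, w), for m = 0..count-1."""
    return [sum(w * x ** m for x, w in atoms) for m in range(count)]

def split_minimal(atoms):
    """Separate the jump of chi_1 at x = 0 (the constant gamma) from the rest."""
    gamma = sum(w for x, w in atoms if x == 0)
    rest = [(x, w) for x, w in atoms if x != 0]
    return rest, gamma
```

The moments of `rest` differ from those of `atoms` only in the term of index 0, and there by exactly `gamma`, which is the relation $\xi_m = \bar\xi_m + (0)^m\gamma$.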


* We define $(0)^m = 0$ for $m > 0$, $(0)^0 = 1$. Note that $\varphi_m(0) = (0)^m$, $\psi_n(0) = (0)^n$. 

† Hausdorff has already called attention to the distinction between minimal and non-minimal 
completely monotonic sequences. The name minimal is due to D. V. Widder, these Transactions, 
vol. 33 (1931), p. 880. 

‡ That the condition of this theorem is not always sufficient to insure that $\xi_m$ is a minimal 
solution of a non-determining system is shown by the following example. Let (3.1) be a non-determining 
Hausdorff system of the type considered in Theorem 10.2 of I. Let ao=0, hence ¢oo=¢n=co= 

++ =1. Take x(x) =x for OS x1(x) for S231. This function is continuous through- 
out (0, 1). However, the solution given by (3.1’) is readily found to be fo=¢n/2+-¢en/2, :=cn?/2, 
- =Qand this is a non-minimal solution since ¢1>0, and ¢11/2, ¢?/2, 0,0,0,--- is the 
solution of (3.1) given by Theorem 10.2 of I, for *** =0, 




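To show the sufficiency of our condition let us prove that 
$$\xi_m = \int_0^1 \varphi_m(x)\,d\chi_1(x) \qquad (\text{with } \chi_1(0) = \chi_1(+0) = 0)$$

is a minimal solution. Indeed, if $\xi_m$ were not a minimal solution, then we 
should have 

$$\xi_m = \int_0^1 \varphi_m(x)\,d\chi_1(x) = \int_0^1 \varphi_m(x)\,d\bar\chi_1(x) + (0)^m\gamma$$

with $\bar\chi_1$ monotonic, $\bar\chi_1(0) = 0$, and $\gamma > 0$. The function $\chi_{11}(x) = \bar\chi_1(x) + [1 - (0)^x]\gamma$ 
is monotonic and 

$$\int_0^1 \varphi_m(x)\,d\chi_1(x) = \int_0^1 \varphi_m(x)\,d\chi_{11}(x) \qquad (m = 0, 1, 2, \dots)$$

which is impossible, since (3.1) is a determining system, while 
$$0 = \chi_1(0) = \chi_{11}(0) = \chi_1(+0) < \chi_{11}(+0) = \bar\chi_1(+0) + \gamma.$$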


For a more thorough investigation of the solutions of the system (3.3) we 
shall need the following 
LEMMA 3.1. Let $\chi(x, y)$ be monotonic in $U$ and satisfy the conditions (2.20). If 
we define a new function $\chi_0(x, y)$ as follows: 

(3.4)
$$\chi_0(x, y) = \chi(x, y) \text{ for } 0 < x \le 1,\ 0 < y \le 1; \qquad \chi_0(0, 0) = \chi(+0, +0);$$
$$\chi_0(x, 0) = \chi(x, +0) \text{ for } 0 < x \le 1; \qquad \chi_0(0, y) = \chi(+0, y) \text{ for } 0 < y \le 1;$$

then the solution (3.3') of (3.3) may be written in the form 

(3.5)
$$\mu_{mn} = \int_0^1\!\!\int_0^1 \varphi_m(x)\,\psi_n(y)\,d_xd_y\chi_0(x, y) + (0)^n\int_0^1 \varphi_m(x)\,d\chi_0(x, 0) + (0)^m\int_0^1 \psi_n(y)\,d\chi_0(0, y) + (0)^{m+n}\chi_0(0, 0).$$


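This follows readily from the definition of the double Stieltjes integral. 
We know that $\mu_{mn}$ is the limit of the following expression (see HS, §3; here 
we take the intermediate points $\xi_i$, $\eta_j$ with $\xi_0 = \eta_0 = 0$, and write $\Delta_{ij}$ for 
the second difference over the rectangle $x_i \le x \le x_{i+1}$, $y_j \le y \le y_{j+1}$) 

$$\sum_{i=0}^{p-1}\sum_{j=0}^{q-1} \varphi_m(\xi_i)\,\psi_n(\eta_j)\,\Delta_{ij}\chi = \sum_{i=0}^{p-1}\sum_{j=0}^{q-1} \varphi_m(\xi_i)\,\psi_n(\eta_j)\,\Delta_{ij}\chi_0 + (0)^n\sum_{i=0}^{p-1} \varphi_m(\xi_i)\,[\chi_0(x_{i+1}, 0) - \chi_0(x_i, 0)]$$
$$+ (0)^m\sum_{j=0}^{q-1} \psi_n(\eta_j)\,[\chi_0(0, y_{j+1}) - \chi_0(0, y_j)] + (0)^{m+n}\chi_0(0, 0),$$

as $p \to \infty$, $q \to \infty$, and all subintervals tend to zero. The last identity follows 
from (3.4). Passing to the limit, it goes over into (3.5) which is thus proved. 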


LEMMA 3.2. If $\mu_{mn}$ is a solution of (3.3) then also 
(3.6) $\bar\mu_{mn} = \mu_{mn} + (0)^n\xi_m + (0)^m\eta_n + (0)^{m+n}\gamma,$


where $\xi_m$ and $\eta_n$ are solutions of (3.1) and (3.2) respectively, and $\gamma \ge 0$, is a 
solution of (3.3). 


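Let 
(3.7) $\xi_m = \displaystyle\int_0^1 \varphi_m(x)\,d\chi_1(x) + (0)^m\gamma_1 \qquad (\chi_1(0) = \chi_1(+0) = 0,\ \gamma_1 \ge 0),$

(3.8) $\eta_n = \displaystyle\int_0^1 \psi_n(y)\,d\chi_2(y) + (0)^n\gamma_2 \qquad (\chi_2(0) = \chi_2(+0) = 0,\ \gamma_2 \ge 0).$

With $\chi_0(x, y)$ defined by (3.4), we define two new functions 
(3.9) $\bar\chi_0(x, y) = \chi_0(x, y) + \chi_1(x) + \chi_2(y) + \gamma_1 + \gamma_2 + \gamma$
and 
(3.10) $\bar\chi(x, y) = \bar\chi_0(x, y)$ for $0 < x \le 1$, $0 < y \le 1$; $\quad \bar\chi(x, 0) = \bar\chi(0, y) = \bar\chi(0, 0) = 0.$


Then $\bar\chi(x, y)$ is a monotonic function of which the corresponding function 
given by (3.4) is precisely $\bar\chi_0(x, y)$. From (3.6), (3.5), (3.7) and (3.8) we then 
derive 

$$\bar\mu_{mn} = \int_0^1\!\!\int_0^1 \varphi_m\,\psi_n\,d_xd_y\bar\chi_0(x, y) + (0)^n\int_0^1 \varphi_m\,d\bar\chi_0(x, 0) + (0)^m\int_0^1 \psi_n\,d\bar\chi_0(0, y) + (0)^{m+n}\bar\chi_0(0, 0)$$
$$= \int_0^1\!\!\int_0^1 \varphi_m\,\psi_n\,d_xd_y\bar\chi(x, y).$$

Hence $\bar\mu_{mn}$ is a solution of (3.3). 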

Our last lemma justifies the following 

DEFINITION 3.2. A solution $\bar\mu_{mn}$ of (3.3) is called a minimal solution if there 
is no other solution $\mu_{mn}$ of (3.3), as well as two solutions $\xi_m$ and $\eta_n$ of (3.1) and 
(3.2), and a constant $\gamma \ge 0$, with $\xi_0 + \eta_0 + \gamma > 0$, such that (3.6) shall hold for 
$m, n = 0, 1, 2, \dots$. 

A first criterion for minimal solutions of (3.3) is given by 

LEMMA 3.3. A solution $\mu_{mn}$ of the determining system (3.3) is a minimal 
solution if and only if both sequences $\mu_{m0}$ and $\mu_{0n}$ are minimal solutions of the 
corresponding systems (3.1) and (3.2), in which case $\mu_{mn}$ is, for any fixed value 
of $n$, a minimal solution of (3.1), and similarly, for any fixed value of $m$, a 
minimal solution of (3.2). 


If $\mu_{mn}$ of (3.5) is a minimal solution of (3.3), then necessarily 
(3.11) $\chi_0(x, 0) = \chi_0(0, y) = \chi_0(0, 0) = 0,$


otherwise the representation (3.5) would contradict the assumption that 
$\mu_{mn}$ is minimal. We therefore have $\chi(x, y) = \chi_0(x, y)$. Integrating by parts* 
we get 

$$\mu_{m0} = \int_0^1\!\!\int_0^1 \varphi_m(x)\,d_xd_y\chi(x, y) = \int_0^1 \varphi_m(x)\,d_x\chi(x, 1),$$

which shows that $\mu_{m0}$ is a minimal solution of the determining system (3.1) 
because $\chi(+0, 1) = \chi_0(0, 1) = \chi(0, 1) = 0$ (Theorem 3.1). A similar proof shows 
that $\mu_{0n}$ is a minimal solution of (3.2). 

Suppose now that $\mu_{mn}$ is not a minimal solution of (3.3). Then 

$$\mu_{mn} = \bar\mu_{mn} + (0)^n\xi_m + (0)^m\eta_n + (0)^{m+n}\gamma,$$

with $\xi_0 + \eta_0 + \gamma > 0$. One of the quantities $\xi_0 + \gamma$, $\eta_0 + \gamma$ is $> 0$. Suppose that 
$\eta_0 + \gamma > 0$. For $n = 0$ we derive 


* See E. W. Hobson, Functions of a Real Variable, vol. 1, 3d edition, 1927, §448. 




$$\mu_{m0} = \bar\mu_{m0} + \xi_m + (0)^m(\eta_0 + \gamma),$$

and hence $\mu_{m0}$ is no minimal solution of (3.1). 
Suppose again that $\mu_{mn}$ is a minimal solution of (3.3). Writing 

$$X(x, y) = \int_0^x\!\!\int_0^y \psi_n(y)\,d_xd_y\chi(x, y),$$
we get 

$$\mu_{mn} = \int_0^1 \varphi_m(x)\,d_xX(x, 1),$$

while 
$$X(x, 1) = \int_0^x\!\!\int_0^1 \psi_n(y)\,d_xd_y\chi(x, y)$$

is obviously continuous at $x = 0$. We conclude again from Theorem 3.1 that 
$\mu_{mn}$ is a minimal solution of (3.1), for $n$ fixed. 
From this lemma we derive the following 


THEOREM 3.2. The solution (3.3') of the determining system (3.3) is a 
minimal solution of this system if and only if $\chi(x, y)$ is continuous as a function of 
$(x, y)$ along the two sides of the unit-square $U$ which meet at the origin. 


For convenience, we denote by $L$ those two sides of $U$. If $\mu_{mn}$ is a minimal 
solution of (3.3), then (3.11) holds and this obviously implies the continuity 
of $\chi(x, y)$ along $L$. 

Conversely, if $\chi(x, y)$ is continuous along $L$, then 

$$\mu_{m0} = \int_0^1\!\!\int_0^1 \varphi_m(x)\,d_xd_y\chi(x, y) = \int_0^1 \varphi_m(x)\,d_x\chi(x, 1)$$

is a minimal solution of the determining system (3.1), since $\chi(+0, 1) = 0$ 
(Theorem 3.1). Similarly $\mu_{0n}$ is a minimal solution of (3.2). It therefore follows 
from Lemma 3.3 that $\mu_{mn}$ is a minimal solution of (3.3) and the theorem is 
proved. 

Of importance is the following 


LEMMA 3.4. Every solution $\mu_{mn}$ of the determining system (3.3) may be 
expressed in the form 


(3.12) $\mu_{mn} = \bar\mu_{mn} + (0)^n\xi_m + (0)^m\eta_n + (0)^{m+n}\gamma,$


where $\bar\mu_{mn}$, $\xi_m$, $\eta_n$ are minimal solutions of the systems (3.3), (3.1), (3.2), and 
$\gamma \ge 0$. This representation is unique and shall be called the minimal 
representation of the solution $\mu_{mn}$ of the system (3.3). 




Let our solution $\mu_{mn}$ be given in the form (3.5). Introducing the new 
function 
(3.13) $\chi_{00}(x, y) = \chi_0(x, y) - \chi_0(x, 0) - \chi_0(0, y) + \chi_0(0, 0),$


from (3.5) we derive 

(3.14)
$$\mu_{mn} = \int_0^1\!\!\int_0^1 \varphi_m(x)\,\psi_n(y)\,d_xd_y\chi_{00}(x, y) + (0)^n\int_0^1 \varphi_m(x)\,d\chi_0(x, 0) + (0)^m\int_0^1 \psi_n(y)\,d\chi_0(0, y) + (0)^{m+n}\chi_0(0, 0).$$


This is a representation of the type (3.12). Indeed, $\chi_0(x, 0)$ and $\chi_0(0, y)$ are 
both continuous at the origin and $\chi_0(0, 0) \ge 0$. Moreover, the function 
$\chi_{00}(x, y)$ defined by (3.13) is readily found to be continuous along $L$ and 
vanishing on $L$. From Theorems 3.1 and 3.2 we infer the truth of our last 
statement. 

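To prove the uniqueness of (3.12), let 

(3.15)
$$\mu_{mn} = \bar\mu_{mn} + (0)^n\xi_m + (0)^m\eta_n + (0)^{m+n}\gamma = \bar\mu'_{mn} + (0)^n\xi'_m + (0)^m\eta'_n + (0)^{m+n}\gamma'$$

be two representations of the type (3.12). For any particular value of $n > 0$, 
we get 

(3.16) $\bar\mu_{mn} = \bar\mu'_{mn} \qquad (m = 1, 2, 3, \dots).$

Since $\bar\mu_{mn}$, $\bar\mu'_{mn}$ $(m = 0, 1, 2, \dots)$ are both minimal solutions of (3.1) (Lemma 
3.3), (3.16) must hold also for $m = 0$. A similar argument applied to the 
subscript $m$ shows that (3.16) holds whenever $m + n > 0$. Then necessarily 
$\bar\mu_{00} = \bar\mu'_{00}$, since both $\bar\mu_{mn}$ and $\bar\mu'_{mn}$ are minimal solutions of (3.3). From (3.15) 
we now derive 
$$(0)^n\xi_m + (0)^m\eta_n + (0)^{m+n}\gamma = (0)^n\xi'_m + (0)^m\eta'_n + (0)^{m+n}\gamma',$$

from which, for $n = 0$, $m > 0$, we get $\xi_m = \xi'_m$, which holds also for $m = 0$. 
Similarly $\eta_n = \eta'_n$, and finally for $m = n = 0$, we obtain $\gamma = \gamma'$. 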
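
The minimal representation (3.12) can be made concrete for a mass distribution consisting of finitely many point masses in $U$. The sketch assumes $\varphi_m(x) = x^m$ and $\psi_n(y) = y^n$, and encodes the distribution as triples (x, y, weight); these choices and the function names are illustrative, not taken from the paper. Masses inside $U$ give $\bar\mu_{mn}$, masses on the side $y = 0$ give $\xi_m$, masses on the side $x = 0$ give $\eta_n$, and the mass at the origin gives $\gamma$.

```python
from fractions import Fraction

def moment(atoms, m, n):
    """mu_mn = sum of w * x**m * y**n over point masses (x, y, w); 0**0 == 1."""
    return sum(w * x ** m * y ** n for x, y, w in atoms)

def minimal_representation(atoms):
    """Split the point masses according to their position relative to L."""
    interior = [(x, y, w) for x, y, w in atoms if x > 0 and y > 0]
    on_x_axis = [(x, w) for x, y, w in atoms if x > 0 and y == 0]  # gives xi_m
    on_y_axis = [(y, w) for x, y, w in atoms if x == 0 and y > 0]  # gives eta_n
    gamma = sum(w for x, y, w in atoms if x == 0 and y == 0)
    return interior, on_x_axis, on_y_axis, gamma

def recombine(parts, m, n):
    """Right-hand side of (3.12) built from the four parts."""
    interior, on_x_axis, on_y_axis, gamma = parts
    zero_pow = lambda e: 1 if e == 0 else 0  # the symbol (0)^e
    mu_bar = moment(interior, m, n)
    xi_m = sum(w * x ** m for x, w in on_x_axis)
    eta_n = sum(w * y ** n for y, w in on_y_axis)
    return mu_bar + zero_pow(n) * xi_m + zero_pow(m) * eta_n + zero_pow(m + n) * gamma
```

Each of the three sequences obtained this way carries no mass on the relevant part of $L$, which is the continuity condition of Theorems 3.1 and 3.2 in this discrete setting.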

A consequence is the following 

THEOREM 3.3. Every solution $\mu_{mn}$ of the determining system (3.3) may be 
represented as follows: 

(3.17)
$$\mu_{mn} = \int_0^1\!\!\int_0^1 \varphi_m(x)\,\psi_n(y)\,d_xd_y\chi_{00}(x, y) + (0)^n\int_0^1 \varphi_m(x)\,d\chi_1(x) + (0)^m\int_0^1 \psi_n(y)\,d\chi_2(y) + (0)^{m+n}\gamma$$
$$(m, n = 0, 1, 2, \dots;\ \gamma \ge 0),$$
where $\chi_{00}(x, y)$, $\chi_1(x)$, $\chi_2(y)$ are monotonic and satisfy the conditions 

(3.18)
$$\chi_{00}(0, 0) = \chi_{00}(x, 0) = \chi_{00}(0, y) = 0, \qquad \chi_{00}(x, y) = \chi_{00}(x + 0, y + 0),$$
$\chi_{00}(x, y)$ is continuous along $L$ $(0 \le x \le 1,\ y = 0;\ x = 0,\ 0 \le y \le 1)$; 

(3.19)
$$\chi_1(0) = \chi_1(+0) = 0, \qquad \chi_1(x) = \chi_1(x + 0) \text{ for } 0 < x < 1,$$
$$\chi_2(0) = \chi_2(+0) = 0, \qquad \chi_2(y) = \chi_2(y + 0) \text{ for } 0 < y < 1.$$

This is a minimal representation of the solution $\mu_{mn}$ and is unique in the sense 
that the three monotonic functions $\chi_{00}(x, y)$, $\chi_1(x)$, $\chi_2(y)$ and the constant $\gamma$ are 
uniquely defined by (3.17), (3.18), and (3.19). The solution $\mu_{mn}$ is minimal if 
and only if $\chi_1(x) \equiv \chi_2(y) \equiv \gamma = 0$. 

From the minimal representation (3.12) and Theorems 3.1 and 3.2 we 
immediately derive (3.17). The uniqueness of (3.17) follows from the uniqueness 
of a minimal representation (Lemma 3.4) and from the fact that our systems 
(3.1), (3.2) and (3.3) are determining systems. 

From (3.3') we derive (3.17) by means of (3.4), (3.13) and 


(3.20) $\chi_1(x) = \chi_0(x, 0) - \chi_0(0, 0), \qquad \chi_2(y) = \chi_0(0, y) - \chi_0(0, 0), \qquad \gamma = \chi_0(0, 0).$


4. Hausdorff systems for double sequences. The system of linear 
inequalities (3.3) is called a Hausdorff system if both systems (3.1) and (3.2) are 
Hausdorff systems, that is to say (see I, §9), when both matrices $A$ and $B$ of 
(2.1) are of the Vandermondean type: 

(4.1)
$$A = \begin{Vmatrix} 1 & a_0 & a_0^2 & \cdots \\ 1 & a_1 & a_1^2 & \cdots \\ \cdots & \cdots & \cdots & \cdots \end{Vmatrix}, \qquad B = \begin{Vmatrix} 1 & b_0 & b_0^2 & \cdots \\ 1 & b_1 & b_1^2 & \cdots \\ \cdots & \cdots & \cdots & \cdots \end{Vmatrix}$$

with 


We shall invariably suppose that 
(4.3) 


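whenever $a_r \to +\infty$, or $b_r \to +\infty$, and discuss the following three further 
possibilities: 


(4.4) $\lim_{r\to\infty} a_r = +\infty, \qquad \lim_{r\to\infty} b_r = +\infty,$

(4.5) $\lim_{r\to\infty} a_r = \alpha, \qquad \lim_{r\to\infty} b_r = \beta \qquad (\alpha, \beta < +\infty),$

(4.6) $\lim_{r\to\infty} a_r = \alpha, \qquad \lim_{r\to\infty} b_r = +\infty \qquad (\alpha < +\infty).$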




As we know from I, §§9-10, only the case (4.4) leads to a determining system 
(3.3). Moreover, in this case 


According to Theorem 3.3, a minimal solution of the system (3.3) (with 
(4.1), (4.2), (4.3), and (4.4)) is given by 


1 1 
= ff 2, 9) 
0 0 


where xo0(%, y) satisfies the conditions (3.18). Writing 
Xo0( = x(x, y), 
we get 


1 1 
(4.8) Lan = f f dix(x, y) (m,n = 0,1, 2,---), 
0 


where x(x, y) satisfies the conditions (3.18) and is uniquely defined by (4.8) 
and (3.18). 
Let us define on U —Z the function 


1 1 
4.9) O< 51,0<y¥8 0). 


Then, since x(x, y) is continuous along L, 


1 1 1 1 
= lim f f x(x, y) = lim f f xem yond y) 


«0 
and hence 
1 1 
(4.10) f atm yond yw(x, y) (m, n=0,1,2,--- 
0 0 
where the integrals are improper and converge in the sense that lim... f'/? 


exists. 
Conversely let @(x, y) be a function defined on U —L, with the properties 


a(x, 1) = a(1, =OforO0< x S51,0< 
y) a(x", 9), a(x, a(x, 9”), 
(4.11) x”, y”, x’, 9’) = a(x”, y”) — a(x”, y’) — a(2’, 
+ a(x’, y’) 2 0, 
forO0< <2” S1,0<y< 




and such that in the sense that lim... [2 /? exists, 


1 1 
(4.12) = ff 9) = 
0 0 
Let 
z y 
4.13) = ff 9) in U, 
0 0 


from which we derive 
1 1 
(4.14) a(x, y) f x .d,x(x, y) in U — L, 
z 
and 
1 1 
(4.15) >= f f (x, y) (m, 0, 1, 2, ). 
0 0 


From (4.8), (4.15), and a theorem* of C. A. Fischer it follows that 
x(x, y) = x(x, y) 


in all the points of U, except possibly for a set of points lying on two denumer- 
able sets of vertical and horizontal line segments 


Vi: <1; =1, 2,3,---), 
Hj: y=n,0Sx851 (0<n; <1; 7 = 1, 2,3, ---). 
From (4.9) and (4.14) it follows that 

w(x, y) = @ (x, y) 


(4.16) 


in all the points of U—L outside the segments (4.16). 
A consequence of these results, of Lemma 3.4, and of Corollary 9.1 of I, 
is the following 


* The theorem referred to is equivalent to the following statement: If gmn(x, y) is a base of con- 
tinuous functions in U, and x(x, y), x(x, y) are two functions of bounded variation in U (both 
vanishing on Z), then 


1 1 
f f &mn(x, y)dzdyx(x, 
0 0 


1 1 
= f f &mn(x, y)dzdyx(x, 9) (m, 0, 1, 2, ), 
0 0 


if and only if x(x, y)=x(x, y) throughout U, except possibly for a set of points contained in two sets 
of line segments of the type (4.16). See C. A. Fischer, Annals of Mathematics, (2), vol. 19(1917-18), 
pp. 39-40, and HS, §3, Lemma 2. 




THEOREM 4.1. Every solution $\mu_{mn}$ of the determining Hausdorff system (3.3) 
derived from the matrices (4.1), whose elements satisfy (4.2), (4.3), and (4.4), 
admits the following minimal representation: 

(4.17)
$$\mu_{mn} = \int_0^1\!\!\int_0^1 x^{a_m}y^{b_n}\,d_xd_y\omega(x, y) + (0)^n\int_0^1 x^{a_m}\,d\rho_1(x) + (0)^m\int_0^1 y^{b_n}\,d\rho_2(y) + (0)^{m+n}\gamma$$
$$(m, n = 0, 1, 2, \dots;\ \gamma \ge 0),$$

where $\omega(x, y)$, defined on $U - L$, satisfies the conditions (4.11) and 
(4.18) $\omega(x, y) = \omega(x - 0, y - 0) \qquad (0 < x < 1,\ 0 < y < 1),$
while $\rho_1(x)$ and $\rho_2(y)$ are monotonic on $0 < x \le 1$ and $0 < y \le 1$, respectively, with 

(4.19) $\rho_1(1) = \rho_2(1) = 0, \qquad \rho_1(x) = \rho_1(x - 0), \qquad \rho_2(y) = \rho_2(y - 0).$

The integrals in (4.17) are improper and convergent in the sense that $\lim_{\epsilon\to 0}\int_\epsilon^1\!\int_\epsilon^1$ 
and $\lim_{\epsilon\to 0}\int_\epsilon^1$ respectively exist. The functions $\omega(x, y)$, $\rho_1(x)$, $\rho_2(y)$ and the 
constant $\gamma$ are uniquely defined by (4.17) and all their further properties described 
above. 

Conversely, the double sequence $\mu_{mn}$ given by (4.17) is always, if $\mu_{00} < \infty$, a 
solution of (3.3). 


We conclude the consideration of this case with the following remark 
which will be useful in the next section: The function $\omega(x, y)$ is uniquely defined 
by the set of equations (4.10) and the conditions (4.11) and (4.18), even if we leave 
out a finite number of equations of the set (4.10). Let the equations with $m < m'$, 
$n < n'$ be left out of the set (4.10). The proof of the uniqueness of $\omega(x, y)$ is 
exactly the same as above, with the only difference that instead of (4.13) we 
associate with $\omega(x, y)$ the function 


We consider now the second assumption (4.5). We know from I, §10, that if 
we write 


a B — bo 


then u=¢,,(x) is the polygonal line which joins the vertices 


(4.20") =)’) (p = 0,1,2,---) and (0, (0)™), 


a-— ao 




and v=¥y,(y) is the polygonal line which joins the vertices 
ont b, 

4.20 (@=0,1,2,---) and (0, (0)"). 
00 


Let A, be the slope of ¢,(x) on the interval x,,:5*S-,, and similarly B, 
the slope of ¥,(y) on the interval 
Let 


1 1 
(4.21) = f m(X)Wn(y)d 2d yxo0(x, y) (m, n= 0, 1, 2, ) 


be a solution of (3.3), with xoo(x, y) monotonic in U, satisfying the conditions 
(3.18). Applying integration by parts in each subrectangle, we obtain 


2dyxoo(x, y) = xoo(x, y)dady 
IN 


1 1 1 
“ f xoo(#, 1)dbm(z) — f xo0(1, + f 
In IN 


1 
+ vatow) Yoo(x, yn)dobm(x) + xoo(1, 1) = XN) X00( XM, 1) 
*N 


— Wnlyw)xo0(1, yw) + yw). 


As N->, this goes over into 


Yq Zp 
= QUA f xoo(x, y)dady — >> Ay xoo(x, 1)dx 
P,q=0 Yost p=0 


(4.22) 
f xoo(1, y)dy + x00(1, 1). 
q=0 


Let us define in U a step-function x(x, y) as follows: 
x(x, y) = 0 on L; x(1, 1) - xoo(1, 1); 


1 zp vq 


for SX < Xp, Ye (6,9 = 9,1, 2,---); 


1 
x(x, 1) 


Zp 
—— xoo(%, 1)dx for 4, (p= 0,1,2,---); 
Xp — Xptir 


x(1, 9) 


1 
—— xoo(1, ydy for yer 
Va Vativy 




This step-function is immediately found to be monotonic in U and continu- 
ous along L. Moreover, from (4.22) we infer that 


‘a 
(4.23) -f f y) (m,n =0,1,2,---). 
0 0 


If we write 
= x(x,» +0, + 0) — x(x» + 0, Yq — 9) — x(x» 0, + 0) 
+ x(x» 0, 0), 


then from (4.23), (4.20’), (4.20’’) and the fact that x(x, y) is continuous along 
L, we derive 
= 


= — a0)?(B — bo) Avg 


with 


we finally obtain 
(4.24) = rvela Am)?(B — by)? (m, n=0,1,2,--- ). 
p=0 g=0 
We now immediately obtain the following 


THEOREM 4.2. Every solution $\mu_{mn}$ of the non-determining Hausdorff system 
(3.3), derived from the matrices (4.1) whose elements satisfy the conditions (4.2), 
(4.3), and (4.5), admits the following minimal representation: 

(4.25)
$$\mu_{mn} = \sum_{p=0}^{\infty}\sum_{q=0}^{\infty} \lambda_{pq}(\alpha - a_m)^p(\beta - b_n)^q + (0)^n\sum_{p=0}^{\infty} \rho_p(\alpha - a_m)^p + (0)^m\sum_{q=0}^{\infty} \sigma_q(\beta - b_n)^q + (0)^{m+n}\gamma$$
$$(m, n = 0, 1, 2, \dots;\ \lambda_{pq} \ge 0,\ \rho_p \ge 0,\ \sigma_q \ge 0,\ \gamma \ge 0).$$

The non-negative coefficients $\lambda_{pq}$, $\rho_p$, $\sigma_q$ and $\gamma$ are all uniquely defined by the set of 
equations (4.25). 

Conversely, the double sequence $\mu_{mn}$ given by (4.25) is always, if $\mu_{00} < \infty$, a 
solution of (3.3). 
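
The converse statement of Theorem 4.2 can be spot-checked on a finite example. The sketch assumes that the representation has the power-series form $\sum \lambda_{pq}(\alpha - a_m)^p(\beta - b_n)^q$ with non-negative coefficients, that $A$ and $B$ are Vandermondean with increasing nodes, and that $D^k$ is the determinant with the sequence in its first column followed by the first $k$ columns of the matrix. The nodes $a_m = \alpha - 1/(m+2)$, the coefficients and all function names are chosen only for illustration.

```python
from fractions import Fraction

def det(M):
    """Determinant by cofactor expansion along the first row (small sizes only)."""
    if not M:
        return Fraction(1)
    return sum((-1) ** j * M[0][j] * det([row[:j] + row[j + 1:] for row in M[1:]])
               for j in range(len(M)))

def D(seq, nodes, k, m):
    """det of the rows (seq[m+i], 1, nodes[m+i], ..., nodes[m+i]**(k-1)), i = 0..k."""
    return det([[seq[m + i]] + [nodes[m + i] ** t for t in range(k)]
                for i in range(k + 1)])

def DD(mu, a, b, k, h, m, n):
    """Apply D to the second subscript first, then to the first subscript."""
    inner = [D(row, b, h, n) for row in mu]
    return D(inner, a, k, m)

def power_series_sequence(coeffs, alpha, beta, a, b):
    """mu[m][n] = sum of lam * (alpha - a_m)**p * (beta - b_n)**q over coeffs."""
    return [[sum(lam * (alpha - am) ** p * (beta - bn) ** q
                 for (p, q), lam in coeffs.items()) for bn in b] for am in a]
```

Each term $(\alpha - a)^p$ is completely monotonic in $a$ for $a < \alpha$, which is why every determinant computed this way comes out non-negative.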


Indeed, let umn be a solution of (3.3). From (3.3’), Lemma (3.1), and 
(3.13) we have 


1 1 
mn = m n d yXo " 
m(X)xXn(y)d 2dyxo0(x, y) 
(4.26) + (0)" f $m(x)dxo0(x, 0) 
0 


+ (0)™ f Yaly)dx0(0, y) + (0)™**x00(0, 0), 
0 


where, as we know, xo0(%, y) is continuous along L, and xo(x, 0), xo(0, y) are 
continuous at the origin. 
From (4.24) and Theorem 10.1 of I, we obtain 


1 1 
f f 24yX00 
0 0 


Drvela — am)?(B — bn)%, 


p=0 g=0 


bn f 0) = Lp(a — an)?, 
0 p=0 


0 q=0 


which are, as is easily seen, minimal solutions of (3.3), (3.1) and (3.2) 
respectively. With $\chi_0(0, 0) = \gamma$, (4.26) goes over into (4.25). The uniqueness of 
the coefficients $\lambda_{pq}$, $\rho_p$, $\sigma_q$ follows from known properties of power series. 

We pass now to the last assumption (4.6). In this case $\varphi_m(x)$ is again the 
polygonal line joining the points (4.20'), while 


(4.27) = sy / bo) | 
Let again 
1 1 
(4.28) od yxoo(x, (m,n ) 


be a solution of (3.3), where the monotonic function x(x, y) has the proper- 
ties (3.18). 
From a theorem of Fréchet* we obtain 


* M. Fréchet, Nouvelles Annales de Mathématiques, (4), vol. 10 (1910), p. 253. 




1 1 
(4.29) Ban =f 9). 
0 0 
Let us define in U a new function x(x, y) as follows: 
x(0, y) = xo0(0, y) = 0, x(1, ¥) = xoo(1, y) for OS y 


1 zp 
x(x, y) = xoo(x, y)dx for 4,05 
 & z 
(p = 0, 1,2,--+). 


This function x(x, y) is also monotonic in U, and from I, §10, we know that 


1 
f 9) = f dalz)dex(2, 9) 
0 0 
(4.30) 
=> (< =) Xp(y) for OS y $1, 
ao 


p=0 — 
where the 


Xp(y) = + 0, vy) — x(%p — 0, y) 


are also monotonic functions which are continuous at y=0. 
From (4.29), (4.30), and (4.27) we derive 


1 
> (< f 0) 
0 


p=d0 — do 
and if we write 


Ap(y) = (a — 
this becomes 


(4.31) Emn = D(a on)? p(y). 


In this last form we easily recognize that Z,,, is a minimal solution of the sys- 
tem (3.3). Introducing the monotonic functions 


1 
(4.31) becomes 


1 


Just as above we immediately derive the following 




THEOREM 4.3. Every solution $\mu_{mn}$ of the non-determining Hausdorff system 
(3.3), derived from the matrices (4.1) whose elements satisfy the conditions (4.2), 
(4.3), and (4.6), admits the following minimal representation: 

(4.32)
$$\mu_{mn} = \sum_{p=0}^{\infty} (\alpha - a_m)^p\int_0^1 y^{b_n}\,d\omega_p(y) + (0)^n\sum_{p=0}^{\infty} \rho_p(\alpha - a_m)^p + (0)^m\int_0^1 y^{b_n}\,d\sigma(y) + (0)^{m+n}\gamma$$
$$(m, n = 0, 1, 2, \dots;\ \rho_p \ge 0,\ \gamma \ge 0),$$

where all the functions $\omega_p(y)$ and $\sigma(y)$ are monotonic for $0 < y \le 1$ and satisfy the 
conditions 


(4.33) $\omega_p(1) = \sigma(1) = 0, \qquad \omega_p(y) = \omega_p(y - 0), \qquad \sigma(y) = \sigma(y - 0)$ for $0 < y < 1$, 


while all the integrals are improper and converge in the sense that $\lim_{\epsilon\to 0}\int_\epsilon^1$ 
exists. The functions $\omega_p(y)$, $\sigma(y)$ and the coefficients $\rho_p$ and $\gamma$ are uniquely 
defined by (4.32) and (4.33). 

Conversely, the double sequence $\mu_{mn}$ given by (4.32) and (4.33) is always, if 
$\mu_{00} < +\infty$, a solution of the system (3.3). 

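From the results of this section we derive easily the solutions of Hausdorff 
systems of a somewhat different kind. Consider the two Vandermondean 
matrices 

(4.34)
$$A' = \begin{Vmatrix} \cdots & \cdots & \cdots & \cdots \\ 1 & a_{-1} & a_{-1}^2 & \cdots \\ 1 & a_0 & a_0^2 & \cdots \\ 1 & a_1 & a_1^2 & \cdots \\ \cdots & \cdots & \cdots & \cdots \end{Vmatrix}, \qquad B' = \begin{Vmatrix} \cdots & \cdots & \cdots & \cdots \\ 1 & b_{-1} & b_{-1}^2 & \cdots \\ 1 & b_0 & b_0^2 & \cdots \\ 1 & b_1 & b_1^2 & \cdots \\ \cdots & \cdots & \cdots & \cdots \end{Vmatrix},$$

where $a_m$ and $b_n$ are two increasing sequences for $-\infty < m < +\infty$, $-\infty < n < +\infty$, 
and satisfying besides (4.3) one of the conditions (4.4), (4.5) or (4.6). 
An immediate consequence of Theorems 4.1, 4.2, and 4.3 is the following 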


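COROLLARY 4.1. The most general solution of the new type of Hausdorff 
system 


(4.35) $D_1^kD_2^h\mu_{mn} \ge 0 \qquad (k, h = 0, 1, 2, \dots;\ m, n = 0, \pm 1, \pm 2, \dots)$


is given by 

(4.36) $\mu_{mn} = \displaystyle\int_0^1\!\!\int_0^1 x^{a_m}y^{b_n}\,d_xd_y\omega(x, y),$

(4.37) $\mu_{mn} = \displaystyle\sum_{p=0}^{+\infty}\sum_{q=0}^{+\infty} \lambda_{pq}(\alpha - a_m)^p(\beta - b_n)^q \qquad (\lambda_{pq} \ge 0),$

(4.38) $\mu_{mn} = \displaystyle\sum_{p=0}^{+\infty} (\alpha - a_m)^p\int_0^1 y^{b_n}\,d\omega_p(y),$

for $m, n = 0, \pm 1, \pm 2, \dots$, these three representations corresponding 
respectively to the possible assumptions (4.4), (4.5), and (4.6). The functions $\omega(x, y)$, 
$\omega_p(y)$ and the coefficients $\lambda_{pq}$ enjoy the properties, in particular the uniqueness 
properties, described in Theorems 4.1, 4.2, and 4.3. 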


5. Completely monotonic functions of two variables. A function $f(x)$ was 
said to be completely monotonic in an open interval (see I, §11) if it possessed 
derivatives of every order and 

$$(-1)^pf^{(p)}(x) \ge 0 \qquad (p = 0, 1, 2, \dots)$$

throughout this interval. 

Let $f(x, y)$ be defined in an open region $R$. We shall say that $f(x, y)$ is 
completely monotonic in $R$, if all the partial derivatives of $f(x, y)$ of every order 
exist and 

(5.1) $(-1)^{p+q}\,\dfrac{\partial^{p+q}f(x, y)}{\partial x^p\,\partial y^q} \ge 0 \qquad (p, q = 0, 1, 2, \dots)$
throughout the region $R$. 

Hausdorff, Bernstein, and Widder have characterized the completely 
monotonic functions of one variable (see I, §11). In this section we shall 
determine all the functions $f(x, y)$ which are completely monotonic in a rectangular 
region 

(5.2) $R(\alpha_0, \beta_0, \alpha, \beta):\ \alpha_0 < x < \alpha,\ \beta_0 < y < \beta \qquad (-\infty \le \alpha_0 < \alpha \le +\infty,\ -\infty \le \beta_0 < \beta \le +\infty).$

All possible cases will be taken care of if we consider successively the 
following three regions: 

(5.2') $\alpha_0 < x < +\infty, \qquad \beta_0 < y < +\infty,$

(5.2'') $\alpha_0 < x < \alpha, \qquad \beta_0 < y < \beta \qquad (\alpha, \beta < +\infty),$

(5.2''') $\alpha_0 < x < \alpha, \qquad \beta_0 < y < +\infty \qquad (\alpha < +\infty),$
where $\alpha_0$ and $\beta_0$ are either finite or else $= -\infty$. 


Let us first assume that f(x, y) is completely monotonic in the region (5.2').
Consider the Hausdorff system (4.35) derived from the matrices (4.34) whose
elements enjoy, besides (4.3) and (4.4), the further property


(5.3) $\lim_{m \to -\infty} a_m = \alpha_0, \quad \lim_{n \to -\infty} b_n = \beta_0.$


It follows immediately from (5.1) that for any fixed value of $x\ (> \alpha_0)$ and
for $p \ge 0$, the function

$g(y) = (-1)^p (\partial^p / \partial x^p) f(x, y)$

is completely monotonic in y for $\beta_0 < y < +\infty$. Hence

$G(x) = D_2^l f(x, b_n)$

is completely monotonic for $\alpha_0 < x < +\infty$, since

$(-1)^p G^{(p)}(x) = D_2^l g(b_n) \ge 0$

(see I, §11, formula (11.3)). For the same reason we have

$D_1^k D_2^l f(a_m, b_n) = D_1^k [D_2^l f(a_m, b_n)] = D_1^k G(a_m) \ge 0$

and therefore

(5.4) $\mu_{mn} = f(a_m, b_n) \quad (m, n = 0, \pm 1, \pm 2, \cdots)$

is a solution of the Hausdorff system (4.35).
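From Corollary 4.1 we then derive


$f(a_m, b_n) = \int_0^1\!\!\int_0^1 \xi^{a_m} \eta^{b_n}\, d_\xi d_\eta\, \omega(\xi, \eta) \quad (m, n = 0, \pm 1, \pm 2, \cdots),$


where $\omega(\xi, \eta)$ has the properties given by Theorem 4.1. From the remark
following Theorem 4.1, we infer that any element of the sequences $a_m$, $b_n$ may
vary without affecting the function $\omega(\xi, \eta)$, hence


$f(x, y) = \int_0^1\!\!\int_0^1 \xi^{x} \eta^{y}\, d_\xi d_\eta\, \omega(\xi, \eta)$ in $R(\alpha_0, \beta_0, +\infty, +\infty)$.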


The transformation 


i] 




leads to the following 


THEOREM 5.1. Every function f(x, y) which is completely monotonic in the
region (5.2') admits in this region the following representation:


(5.6) $f(x, y) = \int_0^{+\infty}\!\!\int_0^{+\infty} e^{-xu - yv}\, d_u d_v\, \tau(u, v),$


where the function $\tau(u, v)$ has the following properties:

(5.7) $\tau(0, 0) = \tau(u, 0) = \tau(0, v) = 0, \quad \tau(u, v) = \tau(u + 0, v + 0),$
      $\tau(u', v) \le \tau(u'', v), \quad \tau(u, v') \le \tau(u, v''),$
      $\Delta(\tau; u', u''; v', v'') = \tau(u'', v'') - \tau(u'', v') - \tau(u', v'') + \tau(u', v') \ge 0,$


for $0 < u < +\infty$, $0 < v < +\infty$, $0 \le u' < u'' < +\infty$, $0 \le v' < v'' < +\infty$.


The improper Stieltjes integral (5.6) is absolutely convergent in $R(\alpha_0, \beta_0, +\infty, +\infty)$
and the function $\tau(u, v)$ is uniquely defined by (5.6) and (5.7).

Conversely, every function f(x, y) defined by (5.6) and (5.7) is always, if
finite throughout R, a completely monotonic function in $R(\alpha_0, \beta_0, +\infty, +\infty)$.


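The properties (5.7) as well as the uniqueness of $\tau(u, v)$ follow from the
properties (4.11), (4.18), and the uniqueness of $\omega(\xi, \eta)$, by means of the
transformation (5.5). The last sentence of the theorem follows from the
relations


$(-1)^{p+q} \dfrac{\partial^{p+q} f(x, y)}{\partial x^p\, \partial y^q} = \int_0^{+\infty}\!\!\int_0^{+\infty} u^p v^q e^{-xu - yv}\, d_u d_v\, \tau(u, v) \ge 0,$


which are a consequence of (5.6) and where the double integral converges
throughout the region R.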

A particular consequence of Theorem 5.1 is that f(x, y) is a real analytic
and regular function of the real variables x and y in $R(\alpha_0, \beta_0, +\infty, +\infty)$.
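
A small worked instance may make the Laplace-Stieltjes representation of Theorem 5.1 concrete. The sketch below assumes $\alpha_0 = \beta_0 = 0$ and takes the hypothetical choice $\tau(u, v) = uv$ (an absolutely continuous mass distribution); neither choice comes from the paper.

```latex
% Assumed mass function: \tau(u, v) = uv, so d_u d_v \tau(u, v) = du\,dv.
% It vanishes on the axes, is non-decreasing in each variable, and
\Delta(\tau; u', u''; v', v'') = (u'' - u')(v'' - v') \ge 0 .

% Resulting completely monotonic function in x > 0, y > 0:
f(x, y) = \int_0^{+\infty}\!\!\int_0^{+\infty} e^{-xu - yv}\, du\, dv = \frac{1}{xy} .

% Check of the derivative relations:
(-1)^{p+q} \frac{\partial^{p+q}}{\partial x^p\, \partial y^q} \frac{1}{xy}
  = \frac{p!\, q!}{x^{p+1} y^{q+1}}
  = \int_0^{+\infty}\!\!\int_0^{+\infty} u^p v^q e^{-xu - yv}\, du\, dv \ge 0 .
```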

We pass now to the consideration of a function f(x, y) which is completely
monotonic in the region (5.2''). Let the Hausdorff system (4.35) be defined by
two sequences $a_m$, $b_n$ with the properties (4.3), (4.5), and (5.3). Just as in the
previous case, we conclude that (5.4) is a solution of the system (4.35). Hence
from Corollary 4.1 we derive


$f(a_m, b_n) = \sum_{p=0}^{+\infty} \sum_{q=0}^{+\infty} \lambda_{pq} (\alpha - a_m)^p (\beta - b_n)^q \quad (m, n = 0, \pm 1, \pm 2, \cdots),$


and since again any of the numbers $a_m$ or $b_n$ may vary without affecting the
coefficients $\lambda_{pq}$, we derive the following




THEOREM 5.2. Every function f(x, y) which is completely monotonic in the
region (5.2''), admits in this region the following representation:


(5.8) $f(x, y) = \sum_{p=0}^{+\infty} \sum_{q=0}^{+\infty} \lambda_{pq} (\alpha - x)^p (\beta - y)^q,$


where $\lambda_{pq} \ge 0$, from which it follows that f(x, y) may be analytically extended and
is still represented by (5.8) in the region


(5.9) $\alpha_0 < x < 2\alpha - \alpha_0, \quad \beta_0 < y < 2\beta - \beta_0.$


Conversely, every function f(x, y) defined by (5.8), with $\lambda_{pq} \ge 0$, is always, if
finite throughout R, a completely monotonic function in $R(\alpha_0, \beta_0, \alpha, \beta)$.


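The last remark follows from the relations


$(-1)^{p+q} \dfrac{\partial^{p+q} f(x, y)}{\partial x^p\, \partial y^q} = \sum_{p'=p}^{+\infty} \sum_{q'=q}^{+\infty} p!\, q! \binom{p'}{p} \binom{q'}{q} \lambda_{p'q'} (\alpha - x)^{p'-p} (\beta - y)^{q'-q} \ge 0,$

which follow from (5.8) throughout $R(\alpha_0, \beta_0, \alpha, \beta)$.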
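
To see the power-series representation of Theorem 5.2 and its extension region at work, consider the following worked instance. The region ($\alpha_0 = \beta_0 = 0$, $\alpha = \beta = 1$) and the function $1/(xy)$ are assumptions made here for illustration.

```latex
% Assumed region: 0 < x < 1, 0 < y < 1, so \alpha = \beta = 1, \alpha_0 = \beta_0 = 0.
\frac{1}{x} = \frac{1}{1 - (1 - x)} = \sum_{p=0}^{+\infty} (1 - x)^p \qquad (|1 - x| < 1),

f(x, y) = \frac{1}{xy}
        = \sum_{p=0}^{+\infty} \sum_{q=0}^{+\infty} \lambda_{pq} (1 - x)^p (1 - y)^q,
\qquad \lambda_{pq} = 1 \ge 0 .

% The double series converges exactly for |1 - x| < 1, |1 - y| < 1, i.e. in
0 < x < 2 = 2\alpha - \alpha_0, \qquad 0 < y < 2 = 2\beta - \beta_0 .
```

Here the region of convergence coincides with the extended rectangle of the theorem, twice as wide as the original one in each variable.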

Let us finally consider a function f(x, y) which is completely monotonic in the
region (5.2'''). Let the Hausdorff system (4.35) be defined by two sequences
$a_m$, $b_n$ with the properties (4.3), (4.6), and (5.3). As in the two previous cases
we derive from Corollary 4.1 the representation


$f(x, y) = \sum_{p=0}^{+\infty} (\alpha - x)^p \int_0^1 \eta^{y}\, d\omega_p(\eta)$ in $R(\alpha_0, \beta_0, \alpha, +\infty)$,


and hence the following


THEOREM 5.3. Every function f(x, y) which is completely monotonic in the
region (5.2''') admits in this region the following representation:


(5.10) $f(x, y) = \sum_{p=0}^{+\infty} g_p(y) (\alpha - x)^p,$


where the functions $g_p(y)$ are completely monotonic for $\beta_0 < y < +\infty$. The representation
(5.10), which is also unique, converges and gives an analytic extension
of f(x, y) in the region


(5.11) $\alpha_0 < x < 2\alpha - \alpha_0, \quad \beta_0 < y < +\infty.$

Conversely, every function f(x, y) defined by (5.10) with the $g_p(y)$ completely
monotonic for $\beta_0 < y < +\infty$, is always, if finite throughout R, a completely monotonic
function in $R(\alpha_0, \beta_0, \alpha, +\infty)$.


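The last converse statement follows from the relations


$(-1)^{p+q} \dfrac{\partial^{p+q} f(x, y)}{\partial x^p\, \partial y^q} = \sum_{p'=p}^{+\infty} p! \binom{p'}{p} (-1)^q g_{p'}^{(q)}(y) (\alpha - x)^{p'-p} \ge 0,$


which follow from (5.10) throughout the region $R(\alpha_0, \beta_0, \alpha, +\infty)$.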
On account of the results of this section we may express the results of §4
in the following


COROLLARY 5.1. (1) The most general solution of the Hausdorff system


(5.12) $D_1^k D_2^l \mu_{mn} \ge 0 \quad (k, l = 0, 1, 2, \cdots;\ m, n = 0, \pm 1, \pm 2, \cdots),$

defined by the matrices (4.34), whose elements form increasing sequences satisfying
the conditions (4.3), is given by

(5.13) $\mu_{mn} = f(a_m, b_n) \quad (m, n = 0, \pm 1, \pm 2, \cdots),$

where f(x, y) is a function which is completely monotonic in the region


(5.14) $\lim_{m \to -\infty} a_m < x < \lim_{m \to +\infty} a_m, \quad \lim_{n \to -\infty} b_n < y < \lim_{n \to +\infty} b_n.$


The function f(x, y) is uniquely defined.

(2) The most general solution of the Hausdorff system

(5.15) $D_1^k D_2^l \mu_{mn} \ge 0 \quad (k, l, m, n = 0, 1, 2, \cdots),$

defined by the matrices (4.1), whose elements satisfy the conditions (4.2) and
(4.3), is given by

(5.16) $\mu_{mn} = f(a_m, b_n) + (0)^n g_1(a_m) + (0)^m g_2(b_n) + (0)^{m+n} \gamma \quad (\gamma \ge 0),$

where the functions f(x, y), $g_1(x)$ and $g_2(y)$ are completely monotonic for


(5.17) $a_0 \le x < \lim_{m \to +\infty} a_m, \quad b_0 \le y < \lim_{n \to +\infty} b_n$


and are uniquely defined by the relations (5.16).

In the second part of this theorem it is understood that f(x, y) is com- 
pletely monotonic in the interior of the region (5.17) and also continuous on 
the part of the boundary which belongs to this region. 


UNIVERSITY OF CHICAGO, 
CHICAGO, ILL.




SUFFICIENT CONDITIONS FOR THE GENERAL 
PROBLEM OF MAYER WITH VARIABLE 
END POINTS* 


BY 
M. R. HESTENES 


1. Introduction. The problem of the calculus of variations to be con- 
sidered here is the general problem of Mayer with variable end points as pro- 
posed by Bliss (V, p. 305) and recently studied for a particular case in a 
joint paper by Bliss and Hestenes (XVI). As was remarked in the latter paper 
the general problem of Mayer is equivalent to the problem of Bolza, but the 
sets of sufficient conditions which have been given by Morse and Bliss for the 
problem of Bolza are not applicable to the problem of Mayer without further 
modification. In view of this fact it is the purpose of the present paper to 
establish a set of sufficient conditions for the general problem of Mayer with 
variable end points. The proofs here given are equally applicable to the prob- 
lem of Bolza considered as a problem of Mayer. 

The procedure used is similar to that used by Bliss for the problem of 
Bolza (XII, pp. 261-274). We first derive in §4 a further necessary condition 
analogous to that deduced by Bliss for the problem of Bolza. In §5 we con- 
struct an auxiliary problem of Mayer of the type discussed by Bliss and Hes- 
tenes (XVI). Their results are then applied in §§6 and 8 to the general problem 
of Mayer by methods closely related to those suggested by Mayer (XIII, 
pp. 436-465) and Hahn (XIV, pp. 127-136). 

2. Statement of the problem. In the following pages the notation and the
terminology used by Bliss and Hestenes for a particular problem of Mayer
will be used throughout (XVI, pp. 306-309). In addition it will be understood
that the indices $\mu$, $\nu$ have the ranges

$\mu, \nu = 1, \cdots, p \le 2n + 1.$

The general problem of Mayer is then that of minimizing a function
$g[x_1, y(x_1), x_2, y(x_2)]$ in a class of arcs

(2:1) $y_i = y_i(x) \quad (x_1 \le x \le x_2;\ i = 1, \cdots, n)$

which satisfy the differential equations and end conditions

$\phi_\alpha(x, y, y') = 0, \quad \psi_\mu[x_1, y(x_1), x_2, y(x_2)] = 0.$


* Presented to the Society, April 8, 1932; received by the editors June 9, 1932, and, revised, 
December 10, 1932. 

† The Roman numerals in the parentheses in the text refer to the bibliographies at the end of
the paper by Bliss and Hestenes, cited here as XVI, and at the end of the present paper.




480 M. R. HESTENES [April 


As before, the arcs (2:1) and the functions $\phi_\alpha$, $g$, $\psi_\mu$ will be assumed to have
the continuity properties (a), (b), (c) (XVI, p. 306) in a neighborhood of a
particular arc $E_0$ whose minimizing properties are to be studied, the determinant
(2:1) appearing in (c) being now interpreted as a $(2n+2) \times (p+1)$-dimensional
matrix of rank $p+1$.

For the general problem of Mayer the first necessary condition as given 
by Bliss and Hestenes (XVI, p. 307) is modified as follows, and is readily es- 
tablished by the methods which they suggest. The theorem has also been 
established by Morse and Myers (X, p. 245). 

I. THE FIRST NECESSARY CONDITION. Every minimizing arc $E_0$ for the
problem of Mayer with variable end points must satisfy, besides the conditions
(XVI, p. 307)


(2:2) $F_{y_i'} = \int_{x_1}^{x} F_{y_i}\, dx + c_i, \qquad \phi_\alpha = 0,$

the further relation

(2:3) $\big[ (F - y_i' F_{y_i'})\, dx + F_{y_i'}\, dy_i \big]_1^2 + \lambda_0\, dg = 0$


for every set of differentials $dx_1$, $dy_{i1}$, $dx_2$, $dy_{i2}$ satisfying the equations $d\psi_\mu = 0$,
$\lambda_0$ being a suitably chosen constant.

An admissible arc $E_0$ is said to be normal relative to the end conditions
$\psi_\mu = 0$ if there exist for it p sets of admissible variations $\xi_1^\nu$, $\xi_2^\nu$, $\eta_i^\nu(x)$ such that
the determinant $|\Psi_\mu(\xi^\nu, \eta^\nu)|$ is different from zero (XVI, p. 307). For convenience
an arc that is normal relative to the end conditions $\psi_\mu = 0$ will be designated
simply as normal.


THEOREM 2:1. An admissible arc that does not satisfy the necessary condi- 
tion I is normal. 


This follows at once because an admissible arc $E_0$ satisfies the necessary
condition I if and only if every determinant of the form

$\begin{vmatrix} G(\xi^\sigma, \eta^\sigma) \\ \Psi_\mu(\xi^\sigma, \eta^\sigma) \end{vmatrix}$

vanishes, where $\xi_1^\sigma$, $\xi_2^\sigma$, $\eta_i^\sigma(x)$ are $p+1$ sets of admissible variations for $E_0$,
and the function G is obtained from g in the same manner as $\Psi_\mu$ is obtained
from $\psi_\mu$ (V, p. 309).

THEOREM 2:2. An admissible arc $E_0$ that satisfies the necessary condition I
is normal if and only if there exist for it no set of multipliers $\lambda_\alpha(x)$, not vanishing
simultaneously, with which it satisfies equations (2:2) and for which all $(p+1)$-rowed
determinants of the matrix

(2:4) $\begin{Vmatrix} -(F - y_i' F_{y_i'})\big|^{x_1} & -F_{y_i'}\big|^{x_1} & (F - y_i' F_{y_i'})\big|^{x_2} & F_{y_i'}\big|^{x_2} \\ \psi_{\mu x_1} & \psi_{\mu y_{i1}} & \psi_{\mu x_2} & \psi_{\mu y_{i2}} \end{Vmatrix}$

vanish. If $E_0$ is normal the constant $\lambda_0$ can be chosen to be unity, the multipliers
$\lambda_\alpha(x)$ with which $E_0$ satisfies the conditions (2:2) and (2:3) being then unique.

This theorem is an obvious generalization of a theorem given by Bliss and 
Hestenes and can be proved by the same methods (XVI, p. 308). A similar 
theorem has been established by Bolza (III, p. 441). 

3. Theorems on extremals. It is known that in the problems of Mayer a
non-singular extremal arc can be imbedded in a $(2n-1)$-parameter family of
extremals (XVI, p. 311)

(3:1) $y_i = y_i(x, c_1, \cdots, c_{2n-1}), \quad \lambda_\alpha = \lambda_\alpha(x, c_1, \cdots, c_{2n-1}) \quad (x_1 \le x \le x_2).$

Further properties of this family are given in the following theorem:

THEOREM 3:1. Let $E_0$ be a member of the $(2n-1)$-parameter family of extremals
(3:1) for parameter values $(x_{10}, x_{20}, c_0)$. If the matrix

(3:2) $\begin{Vmatrix} y_{ic_s}(x_1, c) \\ y_{ic_s}(x_2, c) \end{Vmatrix}$

has rank $2n-1$ on $E_0$, then there is a neighborhood N of the ends of $E_0$ in
$(x_1 y_1 x_2 y_2)$-space such that the end values of every extremal of the family (3:1) with
ends in N satisfy a relation $W(x_1, y_1, x_2, y_2) = 0$. Conversely, every pair of points
$(x_1, y_1)$, $(x_2, y_2)$ in N satisfying the condition $W = 0$ can be joined by an extremal
E of the family (3:1), and by taking N sufficiently small the parameters $(x_1, x_2, c)$
belonging to E will lie in a preassigned $\epsilon$-neighborhood of those belonging to $E_0$.
The function W has continuous partial derivatives of the first two orders in N.

The theorem can be proved as follows. Select $2n$ constants $a_i$, $b_i$ such that
the determinant

(3:3) $\begin{vmatrix} y_{ic_s}(x_1, c) & a_i \\ y_{ic_s}(x_2, c) & b_i \end{vmatrix}$

is different from zero on $E_0$. Consider now the equations

(3:4) $y_{i1} = y_i(x_1, c) + W a_i, \qquad y_{i2} = y_i(x_2, c) + W b_i.$

These equations are satisfied by the set $(x_{10}, y_{10}, x_{20}, y_{20}, c_0, W = 0)$ belonging
to $E_0$. Furthermore the functional determinant with respect to the variables
$c_s$, $W$ is the determinant (3:3) and is therefore different from zero on $E_0$. It




follows that equations (3:4) have a unique solution

(3:5) $c_s = c_s(x_1, y_1, x_2, y_2), \qquad W = W(x_1, y_1, x_2, y_2)$

in a neighborhood N of the end values $(x_{10}, y_{10}, x_{20}, y_{20})$ belonging to $E_0$. The
right members of equations (3:5) have continuous first and second derivatives
in N since the right and left members of equations (3:4) have such
derivatives. If now the end values of an extremal are in N, then these end
values must satisfy the relation $W(x_1, y_1, x_2, y_2) = 0$ since the solutions of equations
(3:4) are unique. Furthermore every set of values $(x_1, y_1, x_2, y_2)$ in N
satisfying the relation $W = 0$ are the end values of an extremal E with parameter
values $[x_1, x_2, c(x_1, y_1, x_2, y_2)]$, and by taking N sufficiently small these
parameter values will lie in a preassigned $\epsilon$-neighborhood of those belonging
to $E_0$. Hence the theorem is proved.

It is now possible to give an interesting geometric interpretation of nor- 
mality. 

THEOREM 3:2. A non-singular extremal arc $E_0$ whose matrix (3:2) has
rank $2n-1$ is normal if and only if in the space of points $(x_1, y_1, x_2, y_2)$ the
extremal end manifold $W = 0$ and the terminal manifold $\psi_\mu = 0$ are not tangent
to each other at the point $(x_{10}, y_{10}, x_{20}, y_{20})$ defining the end values of $E_0$.


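To prove this it is sufficient, as is readily seen, to show that the derivatives
$W_{x_1}$, $W_{y_{i1}}$, $W_{x_2}$, $W_{y_{i2}}$ are proportional to the elements of the first row of the
matrix (2:4). These derivatives have this property because the relation
$F_{y_i'} \eta_i = $ constant along extremals (XVI, p. 307) with $\eta_i = y_{ic_s} dc_s$ implies that the
differentials $dx_1$, $dy_{i1}$, $dx_2$, $dy_{i2}$, $dc_s$, $dW$ belonging to equations (3:4) satisfy the
relation

(3:6) $\big[ (F - y_i' F_{y_i'})\, dx + F_{y_i'}\, dy_i \big]_1^2 = \big[ F_{y_i'} (dy_i - y_i'\, dx) \big]_1^2 = \big[ F_{y_i'} y_{ic_s} dc_s \big]_1^2 + h\, dW = h\, dW,$

where $h = b_i F_{y_i'}(x_2) - a_i F_{y_i'}(x_1)$. If $h = 0$ on $E_0$ then on account of the relation
$F_{y_i'} \eta_i = $ constant, the determinant (3:3) would vanish on $E_0$, which is not the
case. Hence $h \ne 0$ on $E_0$ and

$-(F - y_i' F_{y_i'})\big|^{x_1} = h W_{x_1}, \quad -F_{y_i'}(x_1) = h W_{y_{i1}}, \quad (F - y_i' F_{y_i'})\big|^{x_2} = h W_{x_2}, \quad F_{y_i'}(x_2) = h W_{y_{i2}},$

as was to be proved.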
4. The necessary condition of Mayer. The necessary condition of Mayer
for the problem of Bolza, as stated by Bliss (XII, p. 266), is also valid for the
problem of Mayer considered here.*

* The proof is somewhat different from that given by Bliss for the problem of Bolza. He has
called my attention to the fact that the argument which he used is inadequate in the case when the
ends of $E_0$ are conjugate, and has suggested the modifications indicated here.

In order to derive this condition we suppose that $E_0$ is a normal non-singular
minimizing arc without corners and hence an extremal arc. If $\xi_1$, $\xi_2$, $\eta_i(x)$ are a set
of admissible variations for $E_0$ which satisfy the conditions $\Psi_\mu(\xi, \eta) = 0$, then $E_0$
is a member of a one-parameter family of admissible arcs with ends satisfying the
conditions $\psi_\mu = 0$ and having $\xi_1$, $\xi_2$, $\eta_i(x)$ as its variations along $E_0$ (IX, p. 695).
For such a family the second variation of the function g to be minimized is
expressible along $E_0$ in the form


2 2 
(4:1) In = yl Fy + 2Fyné | +20 +1,0,) + f 0, 
1 


where $Q$, $Q_\mu$ are quadratic forms in $\xi_1$, $\eta_i(x_1)$, $\xi_2$, $\eta_i(x_2)$ whose coefficients are
the second derivatives of the functions $g$, $\psi_\mu$, respectively, and


2w(x, 1’) Fy + QF MK + nk 


This form for $J_2$ is readily obtained with the help of the transversality condition
(2:3) by the methods used by Bliss and Hestenes (XVI, pp. 311-312).
Let us consider variations satisfying the equations $\Psi_\mu(\xi, \eta) = 0$ along $E_0$, and
of the special form $\xi_1 = dx_1$, $\xi_2 = dx_2$, $\eta_i = \delta y_i = y_{ic_s} dc_s$, where the functions
$y_i(x, c)$ are those defining the $(2n-1)$-parameter family (3:1) of extremals to
which $E_0$ belongs. For such variations the second variation (4:1) can also be
expressed in the form


d’g = (F. + yiF,,)dx? + 2F,byidx 
(4:2) 
+ dy, dy’, 5d)| + 2(Q + 1,04) 
1 


given by Bliss (XII, p. 266), where $\delta\lambda_\alpha = \lambda_{\alpha c_s} dc_s$ and
Q(x, 0, w) = (x, 2, 2’) + + ). 


Since $E_0$ is a minimizing arc the expression (4:1) must be $\ge 0$ for all sets of
admissible variations $\xi_1$, $\xi_2$, $\eta_i(x)$ which satisfy the conditions $\Psi_\mu(\xi, \eta) = 0$.
In particular it must be $\ge 0$ for variations $\xi_1 = dx_1$, $\xi_2 = dx_2$, $\eta_i = \delta y_i$ of the special
type considered above satisfying the conditions $d\psi_\mu = \Psi_\mu(dx, dy) = 0$.
We have therefore the following result:


IV. THE NECESSARY CONDITION OF MAYER. For a normal non-singular minimizing
arc $E_0$ without corners the quadratic form (4:2) must satisfy the condition
$d^2 g \ge 0$ for all sets $(dx_1, dx_2, dc_s) \ne (0, 0, 0)$ which satisfy the equations
$d\psi_\mu = 0$. Furthermore between the end points 1 and 2 on $E_0$ there can be no point 3
conjugate to 1 defined by a value $x_3$ such that $E_0$ is normal on the interval $x_3 x_2$.



The last statement is a slight modification of the condition IV deduced by
Bliss and Hestenes for problems of Mayer having $2n+1$ end conditions
(XVI, p. 315), valid here for $E_0$ since $E_0$ must also be a minimizing arc for such
a problem, as will be seen in the next section.

5. An auxiliary problem of Mayer. In order to construct a problem of
Mayer of the type described in the last paragraph we suppose that $E_0$ is a
minimizing arc for the general problem of Mayer considered here. Its end
values $(x_{10}, y_{10}, x_{20}, y_{20})$ satisfy the conditions $\psi_\mu = 0$ $(\mu = 1, \cdots, p)$. Adjoin
to the functions $\psi_\mu$, $2n+1-p$ functions $\psi_\rho(x_1, y_1, x_2, y_2)$ $(\rho = p+1, \cdots, 2n+1)$
possessing continuous first and second partial derivatives in a neighborhood
of the values $(x_{10}, y_{10}, x_{20}, y_{20})$, vanishing at these values, and having
the determinant

(5:1) $\begin{vmatrix} g_{x_1} & g_{y_{i1}} & g_{x_2} & g_{y_{i2}} \\ \psi_{\rho x_1} & \psi_{\rho y_{i1}} & \psi_{\rho x_2} & \psi_{\rho y_{i2}} \end{vmatrix} \quad (\rho = 1, \cdots, 2n+1)$

different from zero on $E_0$. The new set of end conditions $\psi_\rho = 0$ $(\rho = 1, \cdots, 2n+1)$
defines an auxiliary problem of Mayer of the type discussed by Bliss
and Hestenes. It is clear that $E_0$ is also a minimizing arc for this auxiliary
problem.


THEOREM 5:1. Let $E_0$ be an admissible arc that is normal on the interval
$x_{10} x_{20}$ and satisfies the necessary condition I. If $E_0$ is normal relative to the end
conditions $\psi_\mu = 0$ $(\mu = 1, \cdots, p)$, then it is normal relative to the end conditions
$\psi_\rho = 0$ $(\rho = 1, \cdots, 2n+1)$ just defined.


To prove this theorem we recall that the matrix (2:4) has rank $p+1$ since
$E_0$ is normal relative to the end conditions $\psi_\mu = 0$. Furthermore since $E_0$
satisfies the transversality condition (2:3), it follows that on $E_0$ the derivatives
$g_{x_1}$, $g_{y_{i1}}$, $g_{x_2}$, $g_{y_{i2}}$ are expressible as a linear combination of the rows of
the matrix (2:4), the multiplier of the first row being different from zero. The
rank of the matrix (2:4) formed for the new end conditions $\psi_\rho = 0$ is therefore
unaltered when the elements of the first row are replaced by the derivatives
$g_{x_1}$, $g_{y_{i1}}$, $g_{x_2}$, $g_{y_{i2}}$. The matrix thus formed is the matrix of the determinant
(5:1) and has rank $2n+2$. Hence according to Theorem 2:2, $E_0$ is also normal
relative to the end conditions $\psi_\rho = 0$, and the theorem is established.

6. A fundamental sufficiency theorem. With the help of the auxiliary 
problem just constructed we can prove the following theorem: 


THEOREM 6:1. A FUNDAMENTAL SUFFICIENCY THEOREM. Let $E_0$ be an
extremal arc with the following properties:

(A) $E_0$ satisfies the sufficient conditions for a proper strong relative minimum
with respect to admissible arcs C satisfying the end conditions $\psi_\rho(C) = 0$ of the
auxiliary problem of Mayer defined in §5.

(B) There is a neighborhood M of the ends of $E_0$ in $(x_1 y_1 x_2 y_2)$-space such that
the inequality $g(E) > g(E_0)$ holds for every extremal E of the family (3:1) with
ends in M satisfying the conditions $\psi_\mu(E) = 0$ and not identical with $E_0$.

Then there exist neighborhoods $\mathfrak{F}$ of $E_0$ in xy-space and N of the ends of $E_0$
in $(x_1 y_1 x_2 y_2)$-space such that the inequality $g(C) > g(E_0)$ holds for every admissible
arc C in $\mathfrak{F}$ with ends in N satisfying the conditions $\psi_\mu(C) = 0$ and not identical
with $E_0$.

The proof is based on the following two lemmas, the proofs of which will 
be given in the next section. 


LEMMA 6:1. (Modification of Hahn's Theorem (XIV, p. 129).) The
property (A) for $E_0$ implies the existence of neighborhoods $\mathfrak{F}$ of $E_0$ in xy-space
and M of the ends of $E_0$ in $(x_1 y_1 x_2 y_2)$-space such that for every extremal E of the
family (3:1) with ends in M the inequality $g(C) > g(E)$ holds for every admissible
arc C in $\mathfrak{F}$ with ends in M satisfying the conditions $\psi_\rho(C) = \psi_\rho(E)$ and not
identical with E.


LEMMA 6:2. The property (A) for $E_0$ implies that every neighborhood M of
the end values of $E_0$ has associated with it a second neighborhood N of these end
values such that for every admissible arc C with ends in N there is an extremal E
of the family (3:1) with ends in M satisfying the conditions $\psi_\rho(C) = \psi_\rho(E)$.


With the help of these lemmas the proof of Theorem 6:1 is as follows.
Select first neighborhoods $\mathfrak{F}$ of $E_0$ and M of the ends of $E_0$ effective as in
Lemma 6:1 and as in (B). Select a second neighborhood N of the ends of $E_0$
related to M as in Lemma 6:2. Consider now an admissible arc C in $\mathfrak{F}$ with
ends in N satisfying the conditions $\psi_\mu = 0$. According to Lemma 6:2 there is
an extremal E of the family (3:1) with ends in M satisfying the conditions
$\psi_\mu(E) = 0$, $\psi_\rho(E) = \psi_\rho(C)$, where the functions $\psi_\rho$ are those adjoined to the
functions $\psi_\mu$ to form the auxiliary Mayer problem defined in §5. From Lemma
6:1 it follows that $g(C) \ge g(E)$, and from the property (B) we have
$g(E) \ge g(E_0)$. Hence $g(C) \ge g(E_0)$, the equality being valid only in case C coincides
with $E_0$, as was to be proved.
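
The two-step comparison in this proof can be laid out as a single chain. The display below is only a restatement of the argument, with the equality cases read off from Lemma 6:1 and property (B); the symbol $\psi_\rho$ for the end functions of the auxiliary problem follows the convention assumed in this transcription.

```latex
% C: admissible arc near E_0 with \psi_\mu(C) = 0;
% E: the extremal supplied by Lemma 6:2 with \psi_\rho(E) = \psi_\rho(C).
g(C) \;\ge\; g(E) \;\ge\; g(E_0),
\qquad
\begin{cases}
g(C) = g(E)   & \text{only if } C \text{ is identical with } E   \quad \text{(Lemma 6:1)},\\
g(E) = g(E_0) & \text{only if } E \text{ is identical with } E_0 \quad \text{(property (B))}.
\end{cases}
% Hence g(C) = g(E_0) forces C = E = E_0.
```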

7. Proofs of two lemmas. In order to prove Lemma 6:1 we use the result
obtained by Bliss and Hestenes (XVI, p. 323)* which states that the
property (A) for $E_0$ given in Theorem 6:1 implies the existence of a function
$W(a_1, \cdots, a_n)$ such that the n-parameter family of extremals


(7:1) $y_i = y_i(x, x_{20}, a, W_a), \quad z_i = z_i(x, x_{20}, a, W_a) \quad (x_1 \le x \le x_2)$


contains $E_0$ for parameter values $(x_{10}, x_{20}, a_0)$ and has the determinant
$|y_{ia_k}|$ different from zero along $E_0$.

* In the proof of the theorem referred to here, the authors made use (XVI, Theorem 8:1) of a
suggestion in an abstract by Morse, Bulletin of the American Mathematical Society, vol. 37 (1931),
p. 37. Bliss and Reid proved Morse's result independently before the complete paper of Morse (XVII)
appeared. Bliss and Hestenes used the proof given by Bliss, which is similar to that of Morse, and
inadvertently made no reference to Morse's paper. The proof given by Morse should of course have
priority.

Furthermore each extremal E of the family
(7:1) has on it the element $(x, y, z) = (x_{20}, a, W_a)$, where the $a_i$ are the parameter
values defining E. If now we select $n-1$ functions $W_r(a_1, \cdots, a_n)$
having continuous first and second partial derivatives and such that the
determinant $|W_{a_i}\ W_{r a_i}|$ is different from zero for the values $a_i = a_{i0}$, then
the $(2n-1)$-parameter family of extremals


(7:2) $y_i = y_i(x, x_{20}, a, W_a + b_r W_{ra}) = y_i(x, a, b),$
      $z_i = z_i(x, x_{20}, a, W_a + b_r W_{ra}) = z_i(x, a, b) \quad (x_1 \le x \le x_2)$


contains $E_0$ for parameter values $(x_{10}, x_{20}, a_0, b = 0)$. Moreover every extremal
E of this family has on it the element $(x, y_i, z_i) = (x_{20}, a_i, W_{a_i} + b_r W_{r a_i})$, where
the parameter values $a_i$, $b_r$ are those defining E. The equations expressing
this fact are the equations


$a_i = y_i(x_{20}, a, b), \quad W_{a_i} + b_r W_{r a_i} = z_i(x_{20}, a, b),$

and by differentiation it is found that the determinant


Yio, O 

Zia, 2ib, 
is different from zero for the values $(x, a, b) = (x_{20}, a_0, 0)$. Hence the family
(7:2) is one of the type (3:1), its multipliers $\lambda_\alpha(x, a, b)$ being found in the
usual manner (XVI, pp. 309-311).

Since the determinant $|y_{ia_k}|$ belonging to the family (7:1) is different
from zero on $E_0$, the determinant $|y_{ia_k}(x, a, b)|$ belonging to the family
(7:2) has the same property. Hence the system of equations


(7:3) $y_i = y_i(x, a, b)$

has a unique solution

$a_i = a_i(x, y, b)$


in a neighborhood D of the values $(x, y, b)$ belonging to $E_0$. The functions
$a_i(x, y, b)$ are continuous and possess continuous derivatives of the first two
orders in the domain D. If now we let


(7:4) $p_i(x, y, b) = y_i'[x, a(x, y, b), b], \qquad \lambda_\alpha(x, y, b) = \lambda_\alpha[x, a(x, y, b), b],$


then according to the condition II$_N$' implied by the property (A) on $E_0$, the
domain D can be so restricted that at each element $(x, y, b)$ in D the inequality


$E[x, y, p(x, y, b), \lambda(x, y, b), y'] > 0$


holds for every admissible set $(x, y, y') \ne (x, y, p)$, where $E(x, y, p, \lambda, y')$ is
the Weierstrass E-function (XVI, pp. 317, 324). Furthermore on the hyperplane
$x = x_{20}$ in xy-space the Hilbert integral $J^*$ is independent of the path
when the parameters $b_r$ are fixed (XVI, p. 323, cf. XII, p. 269). It follows
that for each set $b_r$ the region $\mathfrak{F}$ of points $(x, y)$, whose elements $(x, y, b)$
are all in D, forms a field with slope functions and multipliers defined by
equations (7:4) (XVI, p. 322). We have a family of such fields depending
upon the $n-1$ parameters $b_r$. In each field the Weierstrass E-function is $> 0$
unless $y_i' = p_i$. Hence according to a theorem proved by Bliss and Hestenes
(XVI, p. 319) there is a neighborhood M of the end values of $E_0$ such that
every extremal E with ends in M and belonging to one of these fields furnishes
a proper strong relative minimum for the function g in the class of admissible
arcs C in $\mathfrak{F}$ whose ends are in M and satisfy the conditions $\psi_\rho(C) = \psi_\rho(E)$.

Lemma 6:1 will now be established completely if we show that the
neighborhood M of the ends of $E_0$ can be restricted so that every extremal E
of the family (7:2) with ends in M is a member of one of the fields just described.
To do this we select a constant h so that the set $[x, y, a_i(x, y, b), b]$
with elements $(x, y, b)$ in D is the only solution of equations (7:3) satisfying
the relation


(7:5) $a_i(x, y, b) - h \le a_i \le a_i(x, y, b) + h.$


This can always be done since the solution $a_i(x, y, b)$ of equations (7:3) is
isolated. We now select a constant $\epsilon$ such that the inequalities


$|a_i - a_{i0}| < h/2, \qquad |a_{i0} - a_i(x, y, b)| < h/2$


hold along every extremal E of the family (7:2) with parameter values
$(x_1, x_2, a, b)$ in an $\epsilon$-neighborhood of those belonging to $E_0$. The relation (7:5) now
holds for every set of values $(x, y, a, b)$ on E. It follows that $a_i = a_i(x, y, b)$,
and hence E is an extremal of one of the fields just described. This completes
the proof of Lemma 6:1 since according to Theorem 3:1 the neighborhood M
of the ends of $E_0$ can be so restricted that every extremal E of the family
(7:2) with ends in M has parameter values $(x_1, x_2, a, b)$ in the $\epsilon$-neighborhood
just defined.




In order to prove Lemma 6:2 consider first the equations


(7:6) $W(x_1, y_1, x_2, y_2) = 0, \qquad \psi_\rho(x_1, y_1, x_2, y_2) = m_\rho,$

where W is the function defined in Theorem 3:1. As was seen in §3, the functional
determinant of these equations is different from zero on $E_0$. Furthermore
equations (7:6) are satisfied by the set $(x_1, y_1, x_2, y_2, m) = (x_{10}, y_{10}, x_{20}, y_{20}, 0)$
belonging to $E_0$. Hence there is a constant $h > 0$ such that equations
(7:6) have a unique solution


(7:7) $x_1 = x_1(m), \quad y_{i1} = y_{i1}(m), \quad x_2 = x_2(m), \quad y_{i2} = y_{i2}(m)$

for all values $m_\rho$ satisfying the relations $|m_\rho| < h$. If h is sufficiently small,
then according to Theorem 3:1 every pair of points $(x_1, y_1)$, $(x_2, y_2)$ can be
joined by an extremal of the family (3:1). Furthermore it is clear that, if
necessary, the constant h can be further restricted so that every set of values
$(x_1, y_1, x_2, y_2)$ defined by equations (7:7) with $|m_\rho| < h$ is in a preassigned
neighborhood M of the end values of $E_0$. If now we select a second neighborhood
N of the end values of $E_0$ so that every set of values $(x_1, y_1, x_2, y_2)$ in N
satisfies the relation $|\psi_\rho(x_1, y_1, x_2, y_2)| < h$, then every admissible arc C with
ends in N determines a set of values $m_\rho = \psi_\rho(C)$ satisfying the relation $|m_\rho| < h$,
and these in turn determine an extremal arc E with ends in M satisfying
the conditions $\psi_\rho(E) = \psi_\rho(C)$. This proves Lemma 6:2.

8. Sufficient conditions for relative minima. The necessary condition I is
given in §2. The symbols II$_N$', III' will be used to denote the strengthened
conditions of Weierstrass and Clebsch as defined by Bliss and Hestenes
(XVI, p. 324). The symbol IV' will be used to denote the condition IV
of §4 strengthened so as to exclude the equality sign. With these definitions
agreed upon we can state the following theorem:


THEOREM 8:1. SUFFICIENT CONDITIONS FOR A STRONG RELATIVE MINIMUM.
Let $E_0$ be an admissible arc without corners and with end points determined by
values $x_{10}$, $x_{20}$ and satisfying the conditions $\psi_\mu = 0$. If $E_0$ is normal relative to the
end conditions $\psi_\mu = 0$, is normal on every sub-interval $x_{10} x_3$ of $x_{10} x_{20}$, and satisfies
the conditions I, II$_N$', III', IV', then there exist neighborhoods $\mathfrak{F}$ of $E_0$ in xy-space
and N of the ends of $E_0$ in $(x_1 y_1 x_2 y_2)$-space such that the inequality
$g(C) > g(E_0)$ holds for every admissible arc C in $\mathfrak{F}$ with ends in N satisfying the
conditions $\psi_\mu(C) = 0$ and not identical with $E_0$.


The theorem will be established if we can show that the hypotheses of the
theorem imply those of Theorem 6:1. It is easily seen from Theorem 5:1 and
from the sufficiency conditions given by Bliss and Hestenes for the case
$p = 2n+1$ (XVI, p. 324) that $E_0$ is an extremal arc having the property (A)
of Theorem 6:1 provided that we can show that the condition IV', as defined
above, implies that the ends of $E_0$ are not conjugate to each other. If the ends
of $E_0$ were conjugate then the constants $dc_s$ in the expressions $\delta y_i = y_{ic_s} dc_s$
could be selected not all zero so that the differentials $\delta y_i$ would all vanish at
the ends of $E_0$. If we should take these constants $dc_s$ together with the values
$dx_1 = dx_2 = 0$, then the conditions $d\psi_\mu = 0$ would be satisfied and the expression
(4:2) for $d^2 g$ would vanish, which would contradict the condition IV'. Hence
$E_0$ has property (A) of Theorem 6:1.

To prove that $E_0$ has the property (B) of Theorem 6:1 we first note that
the conditions I, III' imply the existence of a family of extremals (3:1) containing
$E_0$ for parameter values $(x_{10}, x_{20}, c_{s0})$. From conditions I, IV' it follows
that $dg = 0$, $d^2 g > 0$ for every set of differentials $(dx_1, dx_2, dc_s) \ne (0, 0, 0)$ which
satisfy the conditions $d\psi_\mu = 0$. But these are the conditions (XV, p. 115)
which insure that $g(x_1, x_2, c_s) > g(x_{10}, x_{20}, c_{s0})$ for all sets $(x_1, x_2, c_s) \ne (x_{10}, x_{20}, c_{s0})$
satisfying the equations $\psi_\mu(x_1, x_2, c_s) = 0$ and lying in a sufficiently small
$\epsilon$-neighborhood of $(x_{10}, x_{20}, c_{s0})$. Furthermore since the ends of $E_0$ are not conjugate
the matrix (3:2) has rank $2n-1$ (XVI, p. 316), and according to
Theorem 3:1 there is a neighborhood M of the ends of $E_0$ such that every
extremal with ends in M has parameter values $(x_1, x_2, c_s)$ in the $\epsilon$-neighborhood
described above. It follows that $g(x_1, y_1, x_2, y_2) > g(x_{10}, y_{10}, x_{20}, y_{20})$ for
every extremal E with ends in M satisfying the conditions $\psi_\mu(E) = 0$ and not
identical with $E_0$. Hence $E_0$ has the property (B) of Theorem 6:1 and Theorem
8:1 is established.
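
The finite-dimensional criterion used above (first differential zero and second differential positive on the linearized constraints) can be seen in a toy case. The functions g and $\psi$ below, and the two parameters $c_1$, $c_2$, are hypothetical and chosen only for illustration; the second differential is taken with the constraint multiplier l included, which is the usual form of such criteria and is an assumption about how the cited condition is to be read.

```latex
% Hypothetical data, base point (c_1, c_2) = (0, 0):
g(c_1, c_2) = c_1^2 - c_2^2, \qquad \psi(c_1, c_2) = c_2 - c_1^2 .

% First order: dg = 2c_1\,dc_1 - 2c_2\,dc_2 = 0 at (0,0), so the multiplier is l = 0,
% and the linearized constraint is d\psi = dc_2 = 0.

% Second order on d\psi = 0:
d^2 (g + l\psi) = 2\,dc_1^2 - 2\,dc_2^2 = 2\,dc_1^2 > 0 \qquad (dc_2 = 0,\ dc_1 \ne 0).

% Direct check on the constraint c_2 = c_1^2:
g(c_1, c_1^2) = c_1^2 - c_1^4 = c_1^2 (1 - c_1^2) > 0 = g(0, 0) \qquad (0 < |c_1| < 1).
```

Without the constraint the origin is a saddle point of g, so the restriction to $d\psi = 0$ is essential in this example.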

In a similar manner sufficient conditions for a weak relative minimum for
the general problem of Mayer with variable end points can be established.
The argument is like that of Bliss and Hestenes (XVI, p. 325) with the help
of simple modifications of Theorem 6:1 and Lemma 6:1 above. The Theorem
10:2 of Bliss and Hestenes remains valid here if we replace the phrase "preceding
theorem" by "Theorem 8:1" and the equations $\psi_\rho = 0$ by $\psi_\mu = 0$.
Similarly Corollary 10:1 of the paper by Bliss and Hestenes is still effective
if we replace "Theorem 10:1" by "Theorem 8:1" and $\psi_\rho$ by $\psi_\mu$.


BIBLIOGRAPHY 


The papers listed below are a continuation of the list at the end of the paper of Bliss and Hes- 
tenes cited here as XVI. 

XIII. Mayer, Zur Aufstellung der Kriterien des Maximums und Minimums der einfachen Integrale
bei variablen Grenzwerten, Leipziger Berichte, vol. 36 (1884), pp. 99-128, vol. 48 (1896), pp. 436-465.

XIV. Hahn, Ueber Variationsprobleme mit variablen Endpunkten, Monatshefte für Mathematik
und Physik, vol. 22 (1911), pp. 127-136.

XV. Hancock, Theory of Maxima and Minima, Ginn and Company, 1917. 

XVI. Bliss and Hestenes, Sufficient conditions for a problem of Mayer in the calculus of variations, 
these Transactions, vol. 35 (1933), pp. 305-326. 

XVII. Morse, Sufficient conditions in the problem of Lagrange with fixed end points, Annals of 
Mathematics, (2), vol. 32 (1931), pp. 567—577. 


UNIVERSITY OF CHICAGO, 
CHICAGO, ILL.


THE STRUCTURE OF THE NUMBER OF 
REPRESENTATIONS FUNCTION 
IN A BINARY QUADRATIC
FORM* 


BY 
GORDON PALL 


This paper contains, primarily, the extension to any integral, binary 
quadratic form of the results of a recent article† concerning positive, binary,
quadratic forms. With suitable conventions almost all the results carry over 
without change, though some of the proofs need slight alterations. Inciden- 
tally, there are treated automorphs of binary quadratic forms, and (rather 
fully) properties of sets of representations (representations equivalent 
through automorphic transformations) in a binary quadratic form. 

1. Dirichlet‡ has already in all essentials extended the notion of number
of representations to indefinite forms. We shall utilize the following equiva- 
lent definition. 

Two representations (x, y) and (x′, y′) of m in the form f = [a, b, c],§ that is,
two integral solutions of
(1) ax² + bxy + cy² = m,
will be called equivalent if they are transformable one into the other by 
integral automorphs of f. The class of all representations equivalent to a 
given one will be called a set of representations. The number of sets of repre- 
sentations of m in f will be denoted by f(m). (In MZ, f(m) denoted the num- 
ber of representations of m in f.) 

This definition becomes more interesting when we observe that, if 
d (= b² − 4ac) > 0, the number of sets of representations of m in [a, b, c] is
equal to the actual number of solutions of (1) together with certain inequali- 
ties (cf. Theorem 9). The writer developed the theory of these inequalities 
before noticing that Dirichlet (§87, loc. cit.) obtains one such system. How- 
ever, we shall obtain a substantial improvement on Dirichlet’s inequalities 
and give a more complete discussion of the infinitely many alternative sys- 
tems. The treatment in §3 is fairly comprehensive. 

* Presented to the Society, October 29, 1932; received by the editors September 26, 1932. 

† Mathematische Zeitschrift, vol. 36 (1933); this article will be referred to here as MZ.

‡ Cf. §§86 and 87 of Vorlesungen über Zahlentheorie, 4th edition, 1894.

§ We use Kronecker forms, for simplicity, throughout. Hence [a, b, c] stands for ax² + bxy + cy².
For the automorphs see §2. 




492 , GORDON PALL [April 


It will then be observed that, once the least positive solution t_1, u_1 of
(2) t² − du² = 4
is known, the labor of representing a number by trial in ax² + bxy + cy² is,
when d > 0, on a par with the work when d < 0. For example, for either of the
equations

x² + 2y² = n, x² − 2y² = n,

where n is a given positive integer, we need to try only the values y² such that
0 ≤ y² ≤ n/2 to obtain a representation in every set. (For x² − 2y² = n the
inequalities of Dirichlet require us to examine 0 < y² < 4n.)

To obtain this improvement it is necessary to introduce a convention
whereby solutions on the boundaries of the inequalities count as ½ instead
of 1 (cf. Theorem 9). For example, if d = 8, and f = [1, 0, −2], then f(n) is
equal to the number g(n) of integral solutions (x, y) of

(3) x² − 2y² = n, 0 ≤ y ≤ (n/2)^{1/2} (n > 0), (−n/2)^{1/2} ≤ y ≤ (−n)^{1/2} (n < 0),

except when ±n = k² or 2k² (k integral), in which cases f(n) = g(n) − 1. The
condition of inequality may be replaced by |x| ≥ 2y ≥ 0, or |y| ≥ x ≥ 0.

If d = b² − 4ac is negative or is a positive square, the number w of integral
automorphs of f is 2, except that w is 4 if d = −4 and w is 6 if d = −3. Then
the number of representations of m in f is wf(m).

Unless otherwise specified each (binary) form in the sequel is a primitive, 
integral, binary, quadratic form of discriminant d, where d is a non-zero 
integer ≡ 0 or 1 (mod 4). For simplicity we do not make the usual convention
that the forms are positive if d <0. 

For any d there are a finite number h of (primitive) classes of forms, say
C_0, C_1, ..., C_{h−1}. Representative forms from these classes are denoted,
respectively, by f_0, f_1, ..., f_{h−1}. We shall always take C_0 to be the principal
class, which represents +1. The system of representative forms will be
designated by S (= S_d). The sum of the numbers of sets of representations of
n in the h forms will be denoted by S(n), so that


(4) S(n) = f_0(n) + f_1(n) + ··· + f_{h−1}(n).


The system of classes C_i constitutes under composition a finite abelian
group with C_0 as identity element. We shall assume their behavior in this
respect as known, and shall interpret C_iC_j, etc., as the product classes under
composition. Further if f is a form belonging to a class F, f^{-1} will denote the
opposite form (belonging to the reciprocal class F^{-1}); and if g belongs to G,
then f^r g^s will denote any form of the class F^r G^s.


1933] THE NUMBER OF REPRESENTATIONS FUNCTION 493 


An ambiguous class C is characterized by the equation C² = C_0; or by
C = C^{-1}; or by containing a form [a, b, c] in which a | b.

We can now state our principal results. 

The function S(n) is a factorable function; for any relative-prime integers
n_1 and n_2,


(5) S(n_1n_2) = S(n_1)S(n_2).


An integer n is semiprime to d, by definition, if n is divisible by no prime
p such that


(6) p > 2 and p² | d, or p = 2 and d ≡ 0 or 4 (mod 16).


For any n semiprime to d, we shall prove


(7) S(n) = Σ_{ν|n} (d|ν),

where ν ranges over the positive divisors of n, and (d|ν) is the Kronecker sym-
bol.

For example in (3), the number g(n) of integral solutions (x, y) is given by
g(n) = Σ(2|ν) unless ±n is a square or the double of a square, and then g(n)
= 1 + Σ(2|ν).
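
The example can be checked numerically. The following is a minimal sketch in Python, restricted to n > 0 and to the form x² − 2y² (d = 8, a single class, so that S(n) = f(n)). The function names are illustrative and do not come from the paper; the symbol (8|ν) is evaluated from the standard rule for (2|ν), and the boundary convention is applied as the correction g(n) − 1 stated with (3).

```python
from math import isqrt

def g(n):
    """Number of integral (x, y) with x^2 - 2y^2 = n and 0 <= y <= sqrt(n/2), for n > 0."""
    count = 0
    y = 0
    while 2 * y * y <= n:
        x2 = n + 2 * y * y
        x = isqrt(x2)
        if x * x == x2:
            count += 1 if x == 0 else 2   # x and -x are distinct unless x = 0
        y += 1
    return count

def is_square(k):
    return k >= 0 and isqrt(k) ** 2 == k

def f(n):
    """Number of sets for n > 0: g(n) - 1 when n = k^2 or 2k^2, otherwise g(n)."""
    boundary = is_square(n) or (n % 2 == 0 and is_square(n // 2))
    return g(n) - 1 if boundary else g(n)

def kronecker_8(v):
    """(8|v) for v > 0: 0 if v is even, +1 if v = +-1 (mod 8), -1 if v = +-3 (mod 8)."""
    if v % 2 == 0:
        return 0
    return 1 if v % 8 in (1, 7) else -1

def divisor_sum(n):
    """Right-hand side of (7) for d = 8."""
    return sum(kronecker_8(v) for v in range(1, n + 1) if n % v == 0)
```

For instance n = 49 gives the four solutions (±7, 0), (±9, 4), so g(49) = 4 and f(49) = 3, which agrees with the divisor sum over 1, 7, 49.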

The system {f_0(n), ..., f_{h−1}(n)} is reducible in the following sense: for
every prime p not satisfying (6) and every integer α > 0 there exists a matrix
of h² numbers ψ_{ij}(p, α) (i, j = 0, ..., h − 1) such that, for every integer m
prime to p,


(8) f_i(p^α m) = Σ_{j=0}^{h−1} ψ_{ij}(p, α) f_j(m).


More precisely the following formulas hold. 

Let f be a form of discriminant d and F its class. 

If p | d but does not satisfy (6), p is represented by an ambiguous class C
of discriminant d. If g belongs to C^α F,


(9) f(p^α n) = g(n) for every integer n.

If (d|p) = −1 (Kronecker symbol),

(10) f(pm) = 0, f(p²n) = f(n),


for every m prime to p and every integer n.
If (d|p) = 1 there is a form g of discriminant d representing p. For every
integer n,


(11) f(pn) + f(n/p) = fg(n) + fg^{-1}(n).




Solving this relation as in MZ §6 we obtain


(12) f(p^α m) = Σ_{k=0}^{α} fg^{α−2k}(m),


which holds for every integer α ≥ 0 and m prime to p.

Finally let p satisfy (6). Again f(pm) = 0 if p does not divide m. Now there
exists a form g, of discriminant d′ = d/p², which may be characterized as
representing every number represented by f. For this form,

(13) f(p²n) = σg(n) (every n),


where σ is partially defined by


(14) σ = 1 if d′ < −4 or d′ is a square,
     σ = 2 if d′ = −4,
     σ = 3 if d′ = −3.*


If d′ is positive but not square, employ the notation t_k, u_k for the successive
solutions of
(15) t² − d′u² = 4,


t_1, u_1 being the least positive solution (as in §3.4 with d′ in place of d). Then
σ is the (least) positive index such that
(16) u_σ ≡ 0, u_k ≢ 0 (0 < k < σ) (mod p).
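
The index σ of (16) can be computed directly. The following Python sketch is only an illustration under stated assumptions: d′ is taken to be positive, non-square and ≡ 0 or 1 (mod 4); the least positive solution is found by brute force over u; and successive solutions are produced by the composition rule t_{k+1} = ½(t_1t_k + d′u_1u_k), u_{k+1} = ½(u_1t_k + t_1u_k) of §3.4. The function names are hypothetical.

```python
from math import isqrt

def least_solution(d):
    """Least positive (t, u) with t^2 - d*u^2 = 4, for d > 0 not a square (brute force over u)."""
    u = 1
    while True:
        t2 = d * u * u + 4
        t = isqrt(t2)
        if t * t == t2:
            return t, u
        u += 1

def sigma(d_prime, p):
    """Least positive index k such that p divides u_k, where (t_k, u_k) solve t^2 - d'u^2 = 4."""
    t1, u1 = least_solution(d_prime)
    t, u, k = t1, u1, 1
    while u % p != 0:
        # compose with the least solution; both right-hand sides use the old (t, u)
        t, u = (t1 * t + d_prime * u1 * u) // 2, (u1 * t + t1 * u) // 2
        k += 1
    return k
```

With d′ = 8 and p = 3 the solutions are (6, 2), (34, 12), so σ = 2; correspondingly the least positive solution of t² − 72u² = 4 is (34, 4) = (t_2, u_2/3).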
To understand these formulas properly we should observe that 


(17) If p is a prime, p and −p are each represented in one of the classes of
S unless (6) holds or (d| p) = —1; 
(18) Either is represented in at most one class and the reciprocal class. 


A class and its reciprocal, being improperly equivalent, represent the same 
numbers. If d<0 the classes of S will occur in pairs, each class being accom- 
panied by its negative. 

The generality of our results as holding even for d square (but ≠ 0) may
be emphasized.

All the results of MZ, with the slight changes obvious from the preceding
statements, hold for any integral binary quadratic form with discriminant
d ≠ 0. If d < 0 the form −f_0, of exponent 2 may be adjoined to the basis of
MZ §1, if desired.

It is interesting to obtain a formula for the ψ_{ij} of (8). We can choose


* The reader will note that if f(m) meant the actual number of representations instead of the
number of sets we should have f(p²n) = g(n) in all the cases in (14).




ψ_{ik} = ψ_{ij} if f_j and f_k belong to reciprocal classes. Then ψ_{ij} = ½f_i(p^α q), where q
is any prime represented in f_j (and f_k). We may choose q so that q ≠ p and
(d|q) = 1. Hence


(19) ψ_{ij}(p, α) = ½{f_if_j(p^α) + f_if_k(p^α)} (where f_jf_k = f_0).


Suppose (d|p) = 1, g(p) > 0. Then 2ψ_{ij}(p, α) is the number of the elements of
the sequence g^α, g^{α−2}, ..., g^{−α} belonging to the class of f_if_j or its reciprocal,
plus the number in the class of f_if_k or its reciprocal.

2.1. Automorphs. We prove the following theorem. 


THEOREM 1. Let a, b, c be integers of g.c.d. unity, set d = b² − 4ac and suppose
d ≠ 0. Then all integral automorphs (of determinant +1) of [a, b, c] are given by


(20) x = ½(t − bu)x_0 − cuy_0,
     y = aux_0 + ½(t + bu)y_0,


as (t, u) ranges over all integral solutions of
(21) t² − du² = 4.
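
The substitution (20) is easy to test on examples. The sketch below, in Python with illustrative function names, applies (20) for a given solution (t, u) of (21) and lets one confirm that the value of the form is unchanged. It relies on the observation that t² ≡ b²u² (mod 4), so that t and bu have the same parity and the halves are integers.

```python
def form(a, b, c, x, y):
    """Value of the form [a, b, c] at (x, y)."""
    return a * x * x + b * x * y + c * y * y

def automorph(a, b, c, t, u, x0, y0):
    """Apply (20) to (x0, y0); (t, u) must satisfy t^2 - d*u^2 = 4 with d = b^2 - 4ac."""
    d = b * b - 4 * a * c
    assert t * t - d * u * u == 4
    # t and b*u have the same parity, so the integer divisions below are exact
    x = (t - b * u) // 2 * x0 - c * u * y0
    y = a * u * x0 + (t + b * u) // 2 * y0
    return x, y
```

For f = [1, 0, −2] and (t, u) = (6, 2), the representation (3, 1) of 7 is carried into (13, 9), and 13² − 2·9² = 7.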


Let I denote the identity matrix. If T is a non-singular matrix, T^{-1}IT = I.
Hence we have the following lemma:


Lemma 1. If a form has only the two automorphs with matrices ±I, the same
is true of all equivalent forms.


First let d be a positive square. Set d = A², A > 0. Then (21) has only the
trivial solutions (±2, 0) (for which (20) has matrices ±I). The theorem will
therefore follow if it holds for a form equivalent to f = [a, b, c]. Now f is
equivalent to one and only one of the φ(A) forms


(22) [k, A, 0], 0 ≤ k < A, k prime to A,
and it is easy to show that these have only the two trivial automorphs.*


* For, a primitive form (λx + μy)(νx + ρy) is equivalent to a form x(σx + τy), where, by the dis-
criminant, τ = ±A. Replacing y by y + κx we can alter σ by multiples of A. Evidently [k, A, 0] ~′
[k, −A, 0], where ~′ means “is improperly equivalent to”. Hence the fact that (22) constitutes a
complete representative system of forms will follow once we prove that [k, A, 0] ~′ [l, A, 0] when
kl ≡ 1 (mod A), and that [k, A, 0] ~ [l, A, 0] only when k ≡ l (mod A). To prove both these facts and
to obtain all transformations carrying [k, A, 0] into [l, A, 0], compare coefficients in the identity

(αx + βy)[k(αx + βy) + A(γx + δy)] = x(lx + Ay),
where k and l are given prime to A, 0 ≤ k < A, 0 ≤ l < A. Then either β = 0 or kβ + Aδ = 0. The former case
leads to αδ = 1, k + Aγ = l, whence k = l and γ = 0 and the transformation matrix is ±I. The latter
case leads to
αδ − βγ = −β(αk + Aγ)/A = −1, β = ±A, α = ±l, δ = ∓k, γ = ∓(kl − 1)/A.

This footnote and formula (22) will be useful to the reader who may wish to verify that proper-
ties which he knows to hold for d not square continue to hold when d is a square ≠ 0.




Second we shall indicate a uniform proof, valid at least when d is not a
square, by a modification of Dickson’s Introduction to the Theory of Numbers,
§§60 and 69. By taking R² = d, R > 0 if d > 0, −iR > 0 if d < 0, we can define
first and second roots in §60 for any form with d ≠ 0. Then Theorems 72
and 73 hold unchanged together with their proofs, at least if d is not square.
Also §69 holds for any non-square d.

2.2. Proper sets of representations. If (x, y) and (xo, yo) are related by 
(20) we say that (x, y) and (xo, yo) are equivalent representations in f. As 
already defined, all (x, y) equivalent to a given one comprise a set. 


THEOREM 2. The g.c.d. of x and y is the same for all (x, y) of a set. 

This is evident on solving (20) for x_0 and y_0. We may now speak of proper
sets, that is, sets in which the g.c.d. is 1. 

2.3. Proper sets and the congruence z² ≡ d (mod 4m). Let Σ denote the
aggregate of solutions z of


(23) z² ≡ d (mod 4m), 0 ≤ z < 2|m|,
such that
(24) z, m, and (z² − d)/(4m) have g.c.d. 1.


We shall set up a (1, 1) correspondence between the elements z of Σ and the
various proper sets of representations of m by the h forms in S.


For any z write z² − d = 4ml. Then φ = [m, z, l] is a primitive form of
discriminant d, and hence is equivalent to just one of the forms of S, say to
f = [a, b, c]. For brevity we shall write


(25) T = [ x  ξ ]      A = [ ½(t − bu)   −cu       ]
         [ y  η ],         [ au          ½(t + bu) ].


If T is the matrix of one transformation of determinant +1 carrying f into
φ, then the totality of such matrices is given by AT as t, u range through all
integral solutions of (21). But then x and y are relative-prime and m = ax²
+ bxy + cy², that is, (x, y) is a proper representation of m in f. And the class
of first columns of the matrices AT is a set of representations of m in f.

Conversely, let (x, y) be a proper representation of m in f. We can choose
integers η and ξ so that


(26) xη − yξ = 1,
and then the general form of such integers is ξ + τx, η + τy, where τ is an integer.


On applying the transformation with matrix T to f we derive φ = [m, n, l]
where m = ax² + bxy + cy² and


(27) n = 2axξ + b(xη + yξ) + 2cyη,




while l is determined by the discriminant. If we replace ξ and η in T by
ξ + τx and η + τy, m is unchanged but n is replaced by n + 2τm. If we replace
(x, y) by an equivalent representation, so that T is replaced by AT or a
parallel matrix,* we again derive φ or a parallel form. Thus the (1, 1) cor-
respondence is established.

THEOREM 3. Let a, b, c have g.c.d. 1, let d = b² − 4ac ≠ 0, and let m be an
integer ≠ 0. Let f′(m) denote the number of proper sets of representations of m in
f = [a, b, c]. Then f′(m) is equal to the number of roots z of (23) such that
[m, z, (z² − d)/(4m)] is equivalent to f.
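
Summed over the classes, Theorem 3 says that the total number of proper sets of representations of m is the number of roots of (23) satisfying (24). A minimal Python sketch of that count follows; the function name is illustrative, and the search is a plain enumeration over 0 ≤ z < 2|m|.

```python
from math import gcd

def proper_set_roots(d, m):
    """Roots z of z^2 = d (mod 4m), 0 <= z < 2|m|, with gcd(z, m, (z^2 - d)/(4m)) = 1."""
    roots = []
    for z in range(0, 2 * abs(m)):
        if (z * z - d) % (4 * m) == 0:
            l = (z * z - d) // (4 * m)      # third coefficient of [m, z, l]
            if gcd(gcd(z, m), l) == 1:      # condition (24)
                roots.append(z)
    return roots
```

For d = 8 and m = 7 the roots are 6 and 8, corresponding to the forms [7, 6, 1] and [7, 8, 2]; for m = 3 there is no root, and for m = 2 the single root is z = 0.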

2.4. On S′(n) and S(n). Evidently f(m) = Σ f′(m/q²), where q² ranges over
the square divisors of m. The number of proper sets of representations of m
in S is
(28) S′(m) = f_0′(m) + ··· + f_{h−1}′(m),
and is equal to the number of solutions z of (23) and (24). Also,

(29) S(m) = Σ S′(m/q²).


Proceeding as in MZ but now allowing n to be negative as well as positive,
we see that S′(n) and S(n) are factorable (MZ, §2).† Further we have


(30) S’(1) = S’(— 1) = S(1) = S(— 1) = 1, 
(31) S'(n) = S'(— n), S(n) =S(—n) (every n). 


For any prime p ≥ 2 we have, using the Kronecker symbol (d|p),
(32) S′(p^α) = 1 + (d|p) if p does not divide d,

(33) S(p^α) = α + 1 if (d|p) = 1,
            = ½{1 + (−1)^α} if (d|p) = −1,
            = 1 if p | d but (6) does not hold.


If p does not satisfy (6) we have therefore

(34) S(p^α) = 1 + (d|p) + (d|p²) + ··· + (d|p^α).

Hence, if n is semiprime to d,

(35) S(n) = Σ (d|ν),

summed for the positive divisors ν of n. In calculating S(n) it is generally
simpler to factor into primary components, n = ±∏p^α, and to employ (33).
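
The two ways of calculating S(n) can be compared in a short Python sketch. It assumes n > 0; the helper names are illustrative; the Kronecker symbol for an odd prime is evaluated by Euler's criterion and for p = 2 by the residue of d modulo 8; and the symbol is extended multiplicatively to composite ν.

```python
def kronecker_prime(d, p):
    """Kronecker symbol (d|p) for a prime p."""
    if p == 2:
        if d % 2 == 0:
            return 0
        return 1 if d % 8 in (1, 7) else -1
    r = pow(d % p, (p - 1) // 2, p)   # Euler's criterion: r is 0, 1 or p - 1
    return -1 if r == p - 1 else r

def factorize(n):
    """Prime factorization of n > 0 as a dict {p: alpha}, by trial division."""
    out, p = {}, 2
    while p * p <= n:
        while n % p == 0:
            out[p] = out.get(p, 0) + 1
            n //= p
        p += 1
    if n > 1:
        out[n] = out.get(n, 0) + 1
    return out

def S_by_prime_powers(d, n):
    """S(n) computed from (33), for n > 0 semiprime to d."""
    total = 1
    for p, alpha in factorize(n).items():
        s = kronecker_prime(d, p)
        if s == 1:
            total *= alpha + 1
        elif s == -1:
            total *= (1 + (-1) ** alpha) // 2
        # s == 0 means p | d, and the factor is 1
    return total

def S_by_divisors(d, n):
    """S(n) computed from (35): the sum of (d|v) over the positive divisors v of n."""
    def kron(v):
        out = 1
        for p, alpha in factorize(v).items():
            out *= kronecker_prime(d, p) ** alpha
        return out
    return sum(kron(v) for v in range(1, n + 1) if n % v == 0)
```

The prime-power route needs only one factorization of n, whereas the divisor sum evaluates a symbol for every divisor, which illustrates the remark that (33) is generally the simpler of the two.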


* Two matrices like T are called parallel if their first columns are identical and their second 
columns differ by an integral multiple of their first columns. 

† S(n) and S′(n) are used here instead of r(n) and r′(n) in MZ. The value of S(p^α) for all cases
may be read from the table of values of r(p^α) in §3 of MZ.




2.5. Representation of p or −p. Let m = ±p where p is a prime. The
number of roots of (23) and (24) is 0 if (d|p) = −1 or if (6) holds, 1 if p | d but
(6) does not hold, and 2 if (d|p) = 1. In the second case the root of (23) is
z = 0 if d is even and p is odd, or if d ≡ 8 (mod 16) and p = 2; the root is
z = p if d is odd, or if d ≡ 12 (mod 16) and p = 2; whence the form


[±p, z, ...]


associated is ambiguous. In the third case the two roots are of the forms z,
2p − z, where 0 < z < p, and the classes which represent p are represented by
the two forms


[±p, z, ...], [±p, 2p − z, ...],
and are improperly equivalent. These facts prove (17), (18), and the following 
theorem. 


THEOREM 4. Let m = ±p, p prime. Let f denote a primitive form representing
m, and let F be the class of f. Then


(36) f(m) =2 if p does not divide d and F is ambiguous; 
=1 if p|d or if p does not divide d and F is not ambiguous. 


In case p|d, F is necessarily ambiguous. 


2.6. Representation of ±p_1p_2, (d|p_i) = 1. As in MZ §4 we may prove the
following result.


THEOREM 5. Let p_1 and p_2 be distinct primes such that (d|p_1) = (d|p_2) = 1.
Let ε_1 and ε_2 be signs + or −. Let g_i represent ε_ip_i (i = 1, 2). Let G_i be the class of
g_i (i = 1, 2). Let f denote any form of the product class G_1G_2. Then f represents
m = ε_1ε_2p_1p_2 and


(37) f(m) = 4, 2 or 1 according as G_1G_2 coincides with all, just one, or none of
the classes G_1G_2^{-1}, G_1^{-1}G_2, G_1^{-1}G_2^{-1}.*


3. Some properties of sets of representations. By definition, (x, y) and 
(x_0, y_0) are equivalent representations in a form [a, b, c], or belong to the
same set of representations in [a, b, c] if solutions of (2) exist satisfying (1). 
If there is no ambiguity as to the form involved we may write (x, y)~(xo, yo). 

3.1. Transformation of sets. We prove the following theorem. 


THEOREM 6. Let f = [a, b, c] and g = [a′, b′, c′] be primitive integral forms
of discriminants d and de² respectively, de ≠ 0. If


(38) x = αx′ + βy′, y = γx′ + δy′, α, β, γ, δ integers,


* I.e., according as all, just one of, or none of G_1, G_2, and G_1G_2 are ambiguous.




where αδ − βγ = e, is a transformation of f into g, and if (x′, y′) and (x_0′, y_0′)
are equivalent representations in g, then


(39) (x, y) = (αx′ + βy′, γx′ + δy′), (x_0, y_0) = (αx_0′ + βy_0′, γx_0′ + δy_0′)


are equivalent representations in f.
For we have
(40) a′ = aα² + bαγ + cγ², b′ = 2aαβ + b(αδ + βγ) + 2cγδ,
     c′ = aβ² + bβδ + cδ².
By assumption, integers t′, u′ exist satisfying
(41) t′² − de²u′² = 4
and such that
(42) x′ = ½(t′ − b′u′)x_0′ − c′u′y_0′,
     y′ = a′u′x_0′ + ½(t′ + b′u′)y_0′.
Using (39) and (40) it is easy to verify that (20) holds with
(43) t = t′, u = u′e.


3.2. Opposite and ambiguous sets. It is plain from (20) that either of the 
relations 


(44) a| by, c| bx 


holds for all or none of the elements (x, y) of a set. We call such a relation an 
invariant of the set. Another example was the g.c.d. of §2.2. 
If a | by then with each integral solution (x, y) of


(45) ax² + bxy + cy² = n

is associated a solution (x′, y) where x′ = −by/a − x. It is easy to verify that
if (x, y) and (x_0, y_0) are related by (20), then

(46) x′ = ½(t + bu)x_0′ + cuy_0, y = −aux_0′ + ½(t − bu)y_0,


where x_0′ = −by_0/a − x_0. Hence (x′, y) and (x_0′, y_0) are in the same set of
solutions in [a, b, c]. The set thus associated with a given one in which a | by
will be called the x-opposite set. A set in which a | by, and which coincides with
its x-opposite set, will be called x-ambiguous. We shall see later that a set is
x-ambiguous if and only if it contains an element (x, y) in which dy² has one
of the values


(47) 0, −4an, (t_1 − 2)an, −(t_1 + 2)an.




Similarly, if c | bx in a set, there is associated a y-opposite set of representa-
tions (x, y′) in [a, b, c], y′ = −bx/c − y. A y-ambiguous set is one which coin-
cides with its y-opposite set. Later we shall prove that a set is y-ambiguous
if and only if it contains an element (x, y) in which dx² has the value


(48) 0, −4cn, (t_1 − 2)cn, or −(t_1 + 2)cn.


Excluding from consideration the trivial set which consists of the single 
element (0, 0) we have the following theorem. 


THEOREM 7. Let a, b, c be relative-prime integers, ac ≠ 0, and let both a | by
and c | bx hold for the elements of a set. Then the two opposite sets coincide if and
only if ac | b.

For in order that they should coincide it is necessary and sufficient that
for each (or some) element (x, y) of the set there shall exist integers t, u
satisfying (21) and

(49) −by/a − x = ½(t − bu)x − cu(−bx/c − y),
     y = aux + ½(t + bu)(−bx/c − y).


Multiply (49_1) by ½(t + bu), (49_2) by cu, and add, obtaining the first of the
following equations, the second being a rearrangement of (49_2):


(50) ax{1 + ½(t + bu)} = y{acu − ½b(t + bu)},
     cy{1 + ½(t + bu)} = x{acu − ½b(t + bu)}.

Now if ac | b, the integers t = (b² − 2ac)/(ac), u = −b/(ac) satisfy (21) and
(49). Hence it remains only to show that (50) implies that ac | b.

If, in (50), t + bu = −2, we have

y(acu + b) = 0 = x(acu + b),

whence u = −b/(ac) is an integer. Suppose t + bu ≠ −2. Then (50) implies
(51) ax² = cy².
If now ac does not divide b, (51) will have to be satisfied by every element
(x, y) of the set, which is impossible unless the number of elements in the set
is 2 (or the set contains both (x, y) and (x, −y), whence b = 0). Finally let
the set consist of two elements (x, y) and (−x, −y). If the two opposite sets
coincide, either −x − by/a = x and y = −y − bx/c, or −x − by/a = −x and
y = y + bx/c; the first case is impossible and the second implies b = 0.

If a | by in a set, it is x-ambiguous if and only if for each (or for some) (x, y)
of the set there exists a solution (t, u) of (21) such that


(52) 2ax + by = ½t(−2ax − by) + ½duy,
     y = ½u(−2ax − by) + ½ty.


These may be combined into the single equation (cf. (21))


(53) (2ax + by)/y = (t − 2)/u,

where u = 0, t = 2 is equivalent to 2ax + by = 0, and u = 0, t = −2 is interpreted
as y = 0.

Similarly a set is y-ambiguous if and only if c | bx and for each (or some)
(x, y) of the set there exists a solution (t, u) of (21) and


(54) (2cy + bx)/x = (t − 2)/u.


Here (t, u) = (±2, 0) corresponds to 2cy + bx = 0 or x = 0.

3.3. A congruencial property of sets mod p. Let a, b, c be relative prime
integers, d = b² − 4ac ≠ 0. Let p be any prime not dividing ad and such that
(d|p) = 1 (Kronecker symbol). Then there are just two distinct roots m_1, m_2
of


(55) am² + bm + c ≡ 0 (mod p), 0 ≤ m < p.


For any integer n, each integral solution (x, y) of
(56) ax² + bxy + cy² = pn
satisfies one and only one of the three conditions


(57) x ≡ m_1y, y ≢ 0; x ≡ m_2y, y ≢ 0; x ≡ y ≡ 0 (mod p).


It is easy to verify that each of the conditions (57_1), (57_2), (57_3) is an invariant
property of any set of solutions (x, y) of (56).

As regards (57_3) this fact is evident from Theorem 2. Assume in (20) that
x_0 ≡ m_1y_0 (mod p). We readily deduce x ≡ m_1y (mod p).

3.4. Concerning the equation t² − du² = 4. For the remainder of §3 we
shall assume that d is positive but not square. Accordingly d ≥ 5.

All solutions (t, u) of (21) in integers are (t_k, u_k) and (−t_k, −u_k), where
t_0 = 2 and u_0 = 0, (t_1, u_1) is the “least positive” solution of (21), while the re-
maining solutions are linked together by the equations


(58) t_{k+l} = ½(t_kt_l + du_ku_l),
     u_{k+l} = ½(u_kt_l + t_ku_l),


valid for all integers k and l.




Hence it is readily proved that

(59) t_{−k} = t_k, u_{−k} = −u_k,
     t_0/u_0 > t_1/u_1 > t_2/u_2 > ···.


Here and later t_0/u_0 is interpreted as +∞.
It will be useful to note the relations 


= 


(60) 


which hold for all integers /, 7, 7 such that the denominators are different from 
zero. To prove these formulas, cross-multiply and use (58_2) and (59_1). The
second result is related to the first since


(61) (t_h − t_k)/(u_h + u_k) · (t_h + t_k)/(u_h − u_k) = d.


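Since t_0 = 2 and u_0 = 0, we have, with obvious conventions for the cases
where the denominators vanish,


(62) (t_{2k} + 2)/u_{2k} = t_k/u_k, (t_{2k} − 2)/u_{2k} = du_k/t_k,
     (t_{2k+1} + 2)/u_{2k+1} = (t_{k+1} + t_k)/(u_{k+1} + u_k), (t_{2k+1} − 2)/u_{2k+1} = d(u_{k+1} + u_k)/(t_{k+1} + t_k).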

3.5. Distribution of the solutions in a set relative to t_k, u_k. Let (x_0, y_0)
denote any given solution in a set of solutions of (45). Rearranging (20)
slightly we see that the aggregate of solutions (x, y) of the set are given by
(x_k, y_k) and (−x_k, −y_k) (k = 0, ±1, ±2, ...), where

(63) 2ax_k + by_k = ½(2ax_0 + by_0)t_k + ½dy_0u_k,
     y_k = ½(2ax_0 + by_0)u_k + ½y_0t_k.

A pair of equations equivalent to (63) is

(64) 2cy_k + bx_k = ½(2cy_0 + bx_0)t_{−k} + ½dx_0u_{−k},
     x_k = ½(2cy_0 + bx_0)u_{−k} + ½x_0t_{−k}.

It is convenient to write


(65) X_k = 2ax_k + by_k, Y_k = 2cy_k + bx_k.



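Hence we have

(66) 2y_{−1} = y_0t_1 − X_0u_1, 2X_{−1} = X_0t_1 − dy_0u_1,
     2x_{−1} = x_0t_1 + Y_0u_1, 2Y_{−1} = Y_0t_1 + dx_0u_1.


Substituting for t_k and u_k from t_{k+1} = ½t_1t_k + ½du_1u_k and u_{k+1} = ½u_1t_k + ½t_1u_k and
employing (66) we obtain the following four systems each equivalent to (63)
or (64):

(67) u_1X_k = −y_{−1}t_k + y_0t_{k+1}, u_1y_k = −y_{−1}u_k + y_0u_{k+1};

(68) du_1y_k = −X_{−1}t_k + X_0t_{k+1}, u_1X_k = −X_{−1}u_k + X_0u_{k+1};

(69) −u_1Y_k = −x_{−1}t_k + x_0t_{k+1}, u_1x_k = −x_{−1}u_k + x_0u_{k+1};

(70) −du_1x_k = −Y_{−1}t_k + Y_0t_{k+1}, u_1Y_k = −Y_{−1}u_k + Y_0u_{k+1}.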


Since X_k² − dy_k² = 4an, we have |X_k| > R|y_k| if an > 0, and R|y_k| > |X_k|
if an < 0, where R = d^{1/2}. Since also t_k > R|u_k| it is evident from (63) that X_k has
the same sign as X_0 if an > 0, and that y_k has the same sign as y_0 if an < 0.
Similarly from (64), Y_k has the same sign as Y_0 if cn > 0, and x_k has the same
sign as x_0 if cn < 0.

From (67)-(70) with k = 0 or −1 we have, on performing certain subtrac-
tions,

du_1(y_0 − y_{−1}) = (t_1 − 2)(X_{−1} + X_0),

u_1(X_0 − X_{−1}) = (t_1 − 2)(y_{−1} + y_0),
−u_1(Y_0 − Y_{−1}) = (t_1 − 2)(x_{−1} + x_0),
−du_1(x_0 − x_{−1}) = (t_1 − 2)(Y_{−1} + Y_0).


But (x_0, y_0) may be any element of the set. Hence, incorporating the preced-
ing result, we have the following:

(71) if an > 0, every X_k and y_k − y_{k−1} has the same sign;
     if an < 0, every y_k and X_k − X_{k−1} has the same sign;
     if cn > 0, every Y_k and x_{k−1} − x_k has the same sign;
     if cn < 0, every x_k and Y_{k−1} − Y_k has the same sign.*


Hence we can choose an element (x_0, y_0) of the set such that (for every
integer k)
(72) X_k > 0 and y_{−1} < 0 ≤ y_0 if an > 0; y_k > 0 and X_{−1} < 0 ≤ X_0 if an < 0;
or such that
(73) Y_k > 0 and x_1 < 0 ≤ x_0 if cn > 0; x_k > 0 and Y_1 < 0 ≤ Y_0 if cn < 0.


* It is easy to deduce from (71), if a>0, b=0, c<0, that 
if n>0, every Xx, xk, Ye—Ve-1, Ve_1— Yi, has the same sign as xo; 
if n<0, every Vx, ye, Xe-1—Xk, has the same sign as yo. 




For this choice of (x_0, y_0) in case (72) we write


(74) λ = −y_0/y_{−1} if an > 0, λ = −X_0/X_{−1} if an < 0;

and in case (73) we write

(75) λ = −x_0/x_1 if cn > 0, λ = −Y_0/Y_1 if cn < 0.
Thus 


In these respective four cases, we have by (59;) and (67)-(70) the following 
results: 


tet Migr V-1-k thar + Me” 
tet Mit — Vive + Me 
Ye _ + — + Aux) 


(k = 0, 1,2, 3,---). 


It is evident from (53), (54), and (62) that an x- or y-ambiguous set is
characterized by having λ = 0 or 1 in (76) or (77) respectively. Write ε = 1 or
−1 according as an > 0 or an < 0. Then as (x, y) runs through one half the
elements of an x-ambiguous set (the other half consisting of the values
(−x, −y)) the ratio (2ax + by)/y assumes precisely once each of the values


(78) (t_{2h+λ} + 2ε)/u_{2h+λ} (h = 0, ±1, ±2, ...),


λ having a fixed value 0 or 1 for the set. (Corresponding to λ = h = 0 this rela-
tion is to be interpreted as 2ax + by = 0 if ε = −1 and as y = 0 if ε = 1.) For
a y-ambiguous set we interchange a and c, x and y in the preceding. In view
of (2ax + by)² − dy² = 4an we have (47), and similarly (48).

It follows that for an x-ambiguous set one of 4an, −4and, (t_1 + 2)an, or
−(t_1 − 2)an is a square; and similarly for a y-ambiguous set with a replaced
by c.

A glance at (76) (where now 0 < λ < 1 or λ > 1) demonstrates the following
theorem.*


* A unification of two types of interval is obtained by means of (62). 




THEOREM 8. Let the discriminant d of the primitive integral form [a, b, c]
be positive but not square. If a set of solutions of (45) is not x-ambiguous, it
contains precisely one pair (x, y) and (−x, −y) satisfying

(79) y ≠ 0 and |(2ax + by)/y| > (t_1 + 2)/u_1 if an > 0,

(80) 0 < |(2ax + by)/y| < (t_1 − 2)/u_1 if an < 0.

More generally, let k denote any integer ≥ 0. In any set which is not x-ambiguous
occurs just one (x, y) and (−x, −y) satisfying

(81) (t_{k+1} + 2)/u_{k+1} < |(2ax + by)/y| < (t_k + 2)/u_k if an > 0,

(82) (t_k − 2)/u_k < |(2ax + by)/y| < (t_{k+1} − 2)/u_{k+1} if an < 0.

We may designate as Theorem 8’ the analogous result for non-y-ambigu- 
ous sets, obtained by interchanging x and y, a and c throughout Theorem 8. 

Every solution is contained within one of these intervals. 

By (21) and (45) the preceding systems of inequalities may be replaced by
the following, k denoting any integer ≥ 0:

(83) an(t_k − 2) < dy² < an(t_{k+1} − 2) if an > 0,
     −an(t_k + 2) < dy² < −an(t_{k+1} + 2) if an < 0,


if the set is not x-ambiguous. In every such set there is precisely one (x, y)
and (−x, −y) within each of the intervals (83) for k = 0, 1, 2, ...; for a non-
y-ambiguous set the corresponding intervals are

(83′) cn(t_k − 2) < dx² < cn(t_{k+1} − 2) if cn > 0,
      −cn(t_k + 2) < dx² < −cn(t_{k+1} + 2) if cn < 0.


It may be noted that, if b = 0, then y satisfies (83) if and only if x satisfies
(83′) for the same k.


THEOREM 9. Let the discriminant d of the primitive integral form [a, b, c]
be positive but not square. Let (t_1, u_1) be the least positive solution of t² − du² = 4.
Then the number of sets of solutions of ax² + bxy + cy² = n is equal to the number
of solutions (x, y) of this equation satisfying


(84) 2|an| − 2an ≤ dy² ≤ t_1|an| − 2an, y ≥ 0,


with the convention that solutions with y at an end point of this interval are
counted as ½.




In place of (84) we may use
(84′) 2|cn| − 2cn ≤ dx² ≤ t_1|cn| − 2cn, x ≥ 0;
or indeed any of the infinitely many intervals (79)-(83′), with ≤ in place of <,
and with the same convention for end points.

4.1. Reduction formulas for primes p not dividing d. If (d|p) = −1, (56)
requires x ≡ y ≡ 0 (mod p), so that (10) is obvious.

Hence let (d|p) = 1 and employ the notations and hypotheses of §3.3.
Then (56) requires
(85) x = m_iy + pX,
X integral (i = 1 or 2). The equation (85) defines a (1, 1) correspondence
between the integral solutions (x, y) of (56) satisfying x ≡ m_iy (mod p) and
the integral solutions (X, y) of


(86) n = apX² + (2am_i + b)Xy + p^{-1}(am_i² + bm_i + c)y².

Here g_i = [ap, 2am_i + b, ...] is a primitive integral form of discriminant d.
The solutions (X, y) of (86) in which p | y correspond to solutions (X, Y) of
(87) n/p = aX² + (2am_i + b)XY + (am_i² + bm_i + c)Y²,


under the transformation y = pY. The form in (87) is equivalent to f =
[a, b, c]. Hence to conclude for every integer n that
f(pn) = f(n/p) + {g_1(n) − f(n/p)} + {g_2(n) − f(n/p)},
that is,
(88) f(pn) + f(n/p) = g_1(n) + g_2(n),
we have to prove that if (x, y) and (x_0, y_0) are equivalent solutions of (56),
then (X, y) and (X_0, y_0), where
(89) x = m_iy + pX, x_0 = m_iy_0 + pX_0,
are equivalent solutions of (86), and conversely. The converse holds by
Theorem 6. Assume that (20) holds. Then by (89),
X = p^{-1}(x − m_iy) = ½[t − (2am_i + b)u]X_0 − p^{-1}(am_i² + bm_i + c)uy_0,
y = apuX_0 + ½y_0[t + (2am_i + b)u], as required.
The remaining developments of §6, MZ, may now be carried through, if
we use Theorems 4 and 5.
4.2. Reduction formulas for primes dividing d. First, let p > 2, d = b²
− 4ac, p | d, where a, b, c are relative-prime integers. Then p does not divide
a or c, say p does not divide a. We can choose integers A and Q such that


(90) 2Aa + Qp = 1.


Then the equation

(91) ax² + bxy + cy² = pn,
being equivalent to (2ax + by)² − dy² = 4apn, implies
(92) x = −Aby + pX, X integral.


Second, let p = 2, 4 | d. Then a or c is odd, say a. Then (91) implies x ≡ cy (mod
2), so that in place of (92) we have


(93) x = 2X if c is even, x = 2X + y if c is odd.


Thus (92) or (93) sets up a (1, 1) correspondence between the integral
solutions (x, y) of (91) and the integral solutions (X, y) of


(94) apX² + b′Xy + c′y² = n,
where, in the respective cases (92), (93_1), and (93_2),

(95) b′ = bQp, c′ = (aA²b² − Ab² + c)/p (p > 2);
     b′ = b, c′ = ½c (p = 2, c even);
     b′ = 2a + b, c′ = ½(a + b + c) (p = 2, c odd).

It is plain that c′ is an integer in all cases, and that the form


(96) g = [ap, b′, c′]


is of discriminant d. Hence, in case (95_1), p | c′ if and only if p² | d, so that g is
primitive if and only if p² does not divide d. In either of cases (95_2) or (95_3),
the divisor of g is seen to be 1 if d ≡ 8 or 12, but 2 if d ≡ 0 or 4 (mod 16).
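
For an odd prime p the passage from f to g can be checked mechanically. The Python sketch below is an illustration with hypothetical names: it takes A to be the inverse of 2a modulo p (one admissible choice in (90)), computes Q, and then the coefficients b′ = bQp and c′ = (aA²b² − Ab² + c)/p, so that ax² + bxy + cy² = p(apX² + b′Xy + c′y²) whenever x = −Aby + pX.

```python
def reduce_odd_prime(a, b, c, p):
    """For an odd prime p dividing d = b^2 - 4ac and not dividing a, return (A, Q, b1, c1)
    with 2Aa + Qp = 1, b1 = b*Q*p and c1 = (a*A^2*b^2 - A*b^2 + c)/p."""
    d = b * b - 4 * a * c
    assert p > 2 and d % p == 0 and a % p != 0
    A = pow(2 * a, -1, p)                 # modular inverse; needs Python 3.8 or later
    Q = (1 - 2 * A * a) // p              # exact, since 2Aa = 1 (mod p)
    b1 = b * Q * p
    c1 = (a * A * A * b * b - A * b * b + c) // p   # exact, since p | d
    return A, Q, b1, c1

def f_value(a, b, c, x, y):
    """Value of the form [a, b, c] at (x, y)."""
    return a * x * x + b * x * y + c * y * y
```

For f = [1, 1, −1] (d = 5) and p = 5 this gives A = 3, Q = −1 and g = [5, −5, 1], again of discriminant 5.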

In the respective three cases, write


(97) X = (x + Aby)/p, X_0 = (x_0 + Aby_0)/p; X = ½x, X_0 = ½x_0;
     X = (x − y)/2, X_0 = (x_0 − y_0)/2.


We find, by (90) and the values of b′ and c′, that the relations (20) are equiva-
lent to


(98) X = ½(t − b′u)X_0 − c′uy_0,
     y = apuX_0 + ½(t + b′u)y_0.

Thus, if g is primitive, that is, if (6) does not hold, the (1, 1) correspondence
between representations set up by (92) or (93) carries over to sets of represen-
tations. Hence, if f = [a, b, c],


(99) f(pn) = g(n).


4.3. Finally let (6) hold. Now (92) or (93) still sets up a (1, 1)
correspondence between the solutions (x, y) of (91) and the solutions (X, y) of


(100) aX^2 + (b'/p)Xy + (c'/p)y^2 = n/p.


But now, while (x, y) ~ (x_0, y_0) implies (X, y) ~ (X_0, y_0), the converse is no
longer true. The former fact is evident from (98) on replacing ap by a, b'
by b'/p, u by pu, c' by c'/p, and on noticing that t^2 - (d/p^2)(pu)^2 = 4.

In the cases where T = ±2, U = 0 are the only solutions of


(101) T^2 - (d/p^2)U^2 = 4,
we evidently have conversely that
(102) (X, y) ~ (X_0, y_0) implies (x, y) ~ (x_0, y_0),


the first equivalence relating to (100), the second to (91). If d = -3p^2 or
-4p^2, the values of σ in (14) are evident from the relative numbers of
solutions of (20) and (101) (which are the numbers of solutions in a set).

There remains only the case where d is positive but not square. Then
(X, y) ~ (X_0, y_0) may be written as


(103) 2aX + (b'/p)y = (1/2)T(2aX_0 + (b'/p)y_0) + (1/2)(d/p^2)U y_0,
      y = (1/2)U(2aX_0 + (b'/p)y_0) + (1/2)T y_0,


where (T, U) denotes some solution of (101).

Our remaining problem may now be stated precisely. We are given a pair
of integers x_0, y_0 such that X_0, as defined in (97), is an integer. Let K denote
the aggregate of all (integer) pairs x, y defined by (103) and (97), as T, U
range over all the solutions of (101). Each such pair x, y is a solution of (91).
Evidently K is the sum of a certain number of sets of such solutions. The
problem is to determine that number.

Let then x', y' be another pair, defined by


(104) 2aX' + (b'/p)y' = (1/2)T'(2aX_0 + (b'/p)y_0) + (1/2)(d/p^2)U' y_0,
      y' = (1/2)U'(2aX_0 + (b'/p)y_0) + (1/2)T' y_0,
and by
(105) X' = (x' + Aby')/p, X' = x'/2, X' = (x' - y')/2,


respectively; T', U' being another solution of (101). Changing (x, y) to
(-x, -y) or (x', y') to (-x', -y') if necessary, we may suppose (T, U)
= (T_k, U_k) and (T', U') = (T_l, U_l), where T_i, U_i play the same role for (101)
as t_i, u_i for t^2 - du^2 = 4.

Let σ denote the least positive index such that U_σ ≡ 0 (mod p). Hence
u_n = U_{nσ}/p and t_n = T_{nσ} (n = 0, ±1, ±2, ...).
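
The index just defined is easy to compute in examples. The following sketch is an illustration under stated assumptions, not the author's procedure: D stands for d/p^2, taken positive, non-square and congruent to 0 or 1 (mod 4); (T_1, U_1) is taken to be the solution of T^2 - D U^2 = 4 with least positive U; and the function names are hypothetical. Successive solutions are generated from (T_k + U_k sqrt(D))/2 = ((T_1 + U_1 sqrt(D))/2)^k.

```python
from math import isqrt


def fundamental_solution(D):
    """Solution (T, U) of T^2 - D U^2 = 4 with T > 0 and least positive U (brute force)."""
    assert D > 0 and D % 4 in (0, 1) and isqrt(D) ** 2 != D
    U = 1
    while True:
        T2 = D * U * U + 4
        T = isqrt(T2)
        if T * T == T2:
            return T, U
        U += 1


def least_index(D, p):
    """Least k > 0 with U_k divisible by p, together with (T_k, U_k)."""
    T1, U1 = fundamental_solution(D)
    T, U, k = T1, U1, 1
    while U % p != 0:
        # Multiply (T + U sqrt(D))/2 by (T1 + U1 sqrt(D))/2; both quotients are exact.
        T, U = (T1 * T + D * U1 * U) // 2, (T1 * U + U1 * T) // 2
        k += 1
    return k, T, U
```

With D = 5 and p = 3 the solutions run (3, 1), (7, 3), so the index is 2, and (t, u) = (7, 3/3) = (7, 1) indeed satisfies t^2 - 45u^2 = 4.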




On solving for 2aX_0 + (b'/p)y_0 and y_0 from (104) and substituting in (103),
and using relations for T_i, U_i analogous to (58) and (59_1), we obtain


(106) 2aX + (b'/p)y = (1/2)T_{k-l}(2aX' + (b'/p)y') + (1/2)(d/p^2)U_{k-l} y',
      y = (1/2)U_{k-l}(2aX' + (b'/p)y') + (1/2)T_{k-l} y'.


Now, in all three cases p[2aX + (b'/p)y] = 2ax + by. Hence (106) may be
written


(107) 2ax + by = (1/2)T_{k-l}(2ax' + by') + (1/2)d(U_{k-l}/p)y',
      y = (1/2)(U_{k-l}/p)(2ax' + by') + (1/2)T_{k-l} y'.


Now since (2ax' + by')^2 - dy'^2 ≠ 0, these equations cannot hold with T_{k-l}
and U_{k-l}/p replaced by any other numbers. Hence we see that (x, y) and
(x', y') defined by (103) and (104) with (T, U) = (T_k, U_k) and (T', U')
= (T_l, U_l) are equivalent if and only if k ≡ l (mod σ). This proves (13) with
σ as in (16).


UNIVERSITY,
MONTREAL, CANADA


ON ANALYTICAL COMPLEXES* 


BY 
S. LEFSCHETZ anp J. H. C. WHITEHEAD 


1. In his Colloquium Lectures† one of us outlined a proof of an important
theorem regarding the covering of analytic loci by complexes. A proof for
algebraic varieties had previously been given by B. van der Waerden‡, and
B. O. Koopman and A. B. Brown§ have recently proved the theorem for
analytic loci. The object of this paper is to give a detailed proof along the
lines indicated in Topology.

2. We begin with certain general observations|| concerning the nature of a
configuration ξ (at first complex) represented by an analytic system


(2.1) F_h(x_1, ..., x_n) = F_h(x) = 0 (h = 1, 2, ..., r),


in the vicinity of a given point O of ξ which we take as the origin throughout
for the complex euclidean space S_n containing ξ. There is a neighborhood of
O relative to ξ consisting of a finite number of algebroid elements, any one
of them, say w_p, having about its center O, in a suitable coordinate system
y_1, ..., y_n, a canonical representation


(2.2) (a) H(y_1, ..., y_p; y_{p+1}) = 0,
      (b) (∂H/∂y_{p+1}) y_{p+1+i} + G_i(y_1, ..., y_p; y_{p+1}) = 0 (i = 1, ..., n-p-1),
where H, G_i are pseudopolynomials in y_{p+1}, i.e. polynomials with coefficients
analytic in y_1, ..., y_p at (y) = (0), and where moreover H is algebraically
irreducible and special, i.e. its leading coefficient is unity and its other
coefficients are zero at (0). p is the complex dimension of w_p (dim w_p), and also
of ξ at O (dim_O ξ) when dim w = p for some w component of ξ at O, and < p
for all others. When O is not on ξ we agree to take dim_O ξ = -1.

We have the following basic irreducibility property: if ξ does not contain
w_p, then the intersection ξ·w_p is a ξ, whose dimension < p at O. For the case


* Presented to the Society, August 31, 1932; received by the editors in July, 1932, and (revised) 
September 22, 1932. 

† S. Lefschetz, Topology, Colloquium Publications, vol. XII, New York, 1930, p. 364. Except
as introduced here the same notation and terminology will be used as in Topology.

‡ Mathematische Annalen, vol. 102, pp. 337-362.

§ These Transactions, vol. 34 (1932), pp. 231-252.

|| Based on Osgood's Lehrbuch der Funktionentheorie, vol. II, chapter II.




where ξ is defined by a single relation (2.1) see Osgood's proof (loc. cit.,
p. 133), and the extension to any ξ is obvious.

We shall now recall a series of properties, most of them direct consequences
of the preceding.

I. The solution of an infinite system (2.1) about any point O is of the 
same type as for a finite system. 

II. A point of w_p is singular if the rank of the Jacobian matrix J of (2.2)
is < n-p at the point; it is an ordinary point otherwise. The locus α of the
singular points is the singular locus of w_p. Since J contains a minor of order
n-p equal to (∂H/∂y_{p+1})^{n-p} ≢ 0 when H = 0, the conditions that the rank
be < n-p define a ξ not containing w_p. Hence α·w_p = α is a ξ and dim_O α < p.

The characteristic property of an ordinary point (a) is to have relative to
w_p a neighborhood which is a 2p-cell E_{2p}, with a parametric representation


(2.3) x_i - a_i = φ_i(u_1, ..., u_p),


where at (u) = (0) the φ's are analytic, vanish and have a Jacobian matrix of
rank p. Every point of w_p is a limit-point of ordinary points.

III. It is impossible to decompose w_p about O into a sum of r > 1 sets w^i_{q_i}.
For otherwise w^i = w^i·w_p ≠ w_p, hence q_i < p. Therefore w_p would have points
about which the coordinates depend upon q_i < p parameters, which is untrue.
As a noteworthy consequence the resolution of ξ into w components about
O is unique and hence dim_O ξ depends solely upon O and ξ.


IV. Given a fixed coordinate system x_i we shall call vertical the direction
of its x_n axis and denote by P(λ) the projection of the locus λ on x_n = 0. If
the center O of w_p is an isolated intersection with the vertical through O, then
P(w_p) is a w_p of center P(O). This does not require that the coordinates x_i
be canonical for w_p. We may of course assume that O is the origin so that
P(O) = O. Under the assumption w_p may be represented by a system (2.1)
such that no F_h(0, ..., 0, x_n) ≡ 0, hence we may replace all the F's by
pseudopolynomials in x_n. The algebraic elimination of x_n yields then a system
such as (2.1) without x_n, representing P(w_p); hence P(w_p) is a ξ. If this ξ
had r > 1 components about O the vertical cylinders erected on them would
decompose w_p into a ξ having at least r components w about O. Therefore
r = 1 and P(w_p) is a w of center O. If a point Q varies on w_p, x_n(Q) is a
finite-valued function of P(Q), hence P(Q) depends on p parameters and P(w_p)
is a w_p.

Since x_n is a finite-valued function on P(w_p) we have for w_p a
representation (Osgood, loc. cit., p. 114)


(2.4) (a) G_i(x_1, ..., x_{n-1}) = 0,
      (b) H(x_1, ..., x_n) = 0,


where H is a pseudopolynomial in x_n and (2.4a) represents P(w_p) in x_n = 0.
Since no true subset of w_p is a w_p, H is irreducible.

The branch locus β of w_p is its intersection with ∂H/∂x_n = 0. Just as for
the singular locus we have dim_O β < p. Hence w_p possesses ordinary points not
on β.

3. We shall now consider a real analytic variety η. It is a real locus
represented by a real system (2.1) or system with F's all real.* The same system
represents a ξ to be denoted by (η). Let O be a point of η. On following up
Osgood's resolution of (η) into w components about O we find that their
canonical coordinates y_i may be chosen real. This being assumed done we
have for a component w_p three possibilities: (a) The canonical system of w_p
is real and w_p possesses real ordinary points. The real subset of w_p (real
algebroid element) will be denoted by v_p, so that w_p = (v_p); incidentally the form
of (2.2) shows that when p = 0, O is an ordinary point, i.e., it is a v_0. (b) This
case is the same as the preceding except that the real points of w_p are all
singular. (c) The canonical system of w_p cannot be chosen real. When w_p = w̃_p,
H and G_i in (2.2) may be replaced by H + H̃, G_i + G̃_i, both real and of the
same form, hence we have cases (a) or (b). Therefore in case (c) necessarily
w_p ≠ w̃_p.

We shall now show that η may be decomposed about O into a finite sum
of v's. Let p = dim_O (η). Since the required result holds when p = 0 we use
induction on p. The real points of a w_p of type (b) are on the singular locus of
w_p which is an (η) whose dimension at O is < p. As regards the real points of a
w_p of type (c), let f_i = 0 be the canonical equations of w_p. Since (2.1) is real,
f̃_i = 0 are the canonical equations of another component of (η) which is w̃_p.
Hence the real points in question are on w_p·w̃_p and since w_p does not contain
w̃_p, this is a ξ whose dimension at O is < p. But this ξ, being represented by the
real system f_i + f̃_i = 0, -i(f_i - f̃_i) = 0, is also an (η). The real points of
components not of type (a) being thus on varieties (η) whose dimensions at O are
< p, the required result is a consequence of the hypothesis of the induction.

The meaning of dim v_p, dim_O η is as before. As it happens they are precisely
the Urysohn-Menger dimensions, but this does not matter for our purpose.

The irreducibility property holds for η: if η does not contain v_p, dim_O η·v_p
< p. Its proof is as follows. Under the hypothesis (η) does not contain (v_p),
hence p > dim_O (η)·(v_p) ≥ dim_O η·v_p.

Properties I, ..., IV hold with v in place of w and with these
modifications: (a) (2.3) represents a real analytic E_p; (b) (2.4) still represents v_p in the


* The condition F̃(z̃) = (F(z))~, where ~ denotes the complex conjugate, defines an analytic
function F̃, the conjugate of F, and F is real whenever F = F̃. The set of the conjugate points of
the points of a locus λ will be denoted by λ̃, the usual "bar" notation being reserved for the closure.




real and (v_p) in the complex domains, but (2.4a) represents a real v'_p which
may be ≠ P(v_p), since in addition it may contain points which are the
projections of pairs of conjugate points of (v_p). Thus we can only assert that
P(v_p) is a subset of a v'_p. Here again v_p contains an ordinary point Q not on
the branch locus β. Q possesses then relative to v_p a neighborhood which is an
analytic E_p homeomorphic with P(E_p). This implies that in (2.3) the
Jacobian matrix of φ_1, ..., φ_{n-1} is of rank p at (u) = (0). Hence P(Q) has a
neighborhood relative to P(v_p), and not merely relative to v'_p, which is an
analytic E_p. We may think of P(Q) as an ordinary point of P(v_p).

Henceforth we shall deal exclusively with the real domain. 

4. The segments on v_p. Let α_i be direction cosines for S_n, so that (α) is a
point of the unit-sphere H_{n-1} of S_n. A point (x) of v_p (p < n) will not be an
isolated intersection of v_p with the line x_i + sα_i (s variable) when and only
when the MacLaurin series for s of the functions f_j(x + sα) are ≡ 0, where the
f's are the left-hand sides of a representation (2.1) for v_p. There results a real
analytic system


(4.1) Φ_j(x; α) = 0.
Its solutions for (x; α) in the vicinity of any solution (x^0; α^0) make up a
finite number of sets v_q. On such a v_q we shall then have a parametric
representation
(4.2) (a) x_i = φ_i(y_1, ..., y_q), (b) α_i = ψ_i(y_1, ..., y_q),
where φ_i, ψ_i are analytic on v_q. The system (4.2b) represents on H_{n-1} the
directions near (α^0) corresponding to segments on our given v_p associated with
(4.2).

Since (4.2) represents a v_q,


(4.3) || ∂φ_i/∂y_j, ∂ψ_i/∂y_j ||


is of rank q at some points as near as we please to (y^0). On the other hand,
for s arbitrary but small, φ_i + sψ_i represents
a point of our initial v_p, and hence among these functions at most p are
functionally independent, or

(4.4) || ∂φ_i/∂y_j + s ∂ψ_i/∂y_j, ψ_i ||

is of rank ≤ p, and this must hold for s small but arbitrary. Now any
determinant of this matrix containing s is a polynomial in s whose leading
coefficient is the corresponding determinant of


|| ∂ψ_i/∂y_j, ψ_i ||


which must therefore be of rank ≤ p. Owing to the relations


(4.5) Σ ψ_i^2 = 1, Σ ψ_i ∂ψ_i/∂y_j = 0,


the new matrix may be bordered with a row 0, ..., 0, 1 without changing
its rank. It follows that the rank of


|| ∂ψ_i/∂y_j ||


is at most p-1 ≤ n-2. Therefore the directions of segments meeting v_p in an
infinite set are represented on H_{n-1} by a variety η whose dimension at any
point < n-1, and hence they are nowhere dense on the sphere.†
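
The bordering step can be spelled out as follows. This is a sketch under an assumption not legible in the scan: that the relations (4.5) express that the ψ_i are direction cosines, and that the matrix being bordered has the columns ∂ψ_i/∂y_j (j = 1, ..., q) followed by the column ψ_i.

```latex
\sum_{i} \psi_i^{2} = 1
\;\Longrightarrow\;
\sum_{i} \psi_i \,\frac{\partial \psi_i}{\partial y_j} = 0
\quad (j = 1, \dots, q),
\qquad\text{so}\qquad
\sum_{i} \psi_i \Bigl( \tfrac{\partial \psi_i}{\partial y_1}, \dots,
  \tfrac{\partial \psi_i}{\partial y_q},\ \psi_i \Bigr) = (0, \dots, 0, 1).
% Hence the row (0, ..., 0, 1) is a linear combination of the existing rows, and
\operatorname{rank}
\begin{pmatrix}
  \partial \psi_i / \partial y_j & \psi_i \\
  0 \ \cdots \ 0 & 1
\end{pmatrix}
= \operatorname{rank} \Bigl\| \tfrac{\partial \psi_i}{\partial y_j} \Bigr\| + 1 \;\le\; p .
```

On this reading the bound p on the bordered matrix gives the bound p-1 on the rank of the matrix of the ∂ψ_i/∂y_j alone.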

5. Analytic complexes. By an analytic structure ζ we shall mean a real
point set in a real S_n which constitutes a topological space with varieties η
as the neighborhoods. Each point Q of ζ has then a neighborhood relative to
ζ made up of a finite set v^1_{q_1}, ..., v^r_{q_r}, where the v's have Q as their common
center. The largest q_i is the dimension of ζ at Q (dim_Q ζ), and the largest value
p of dim_Q ζ for Q on ζ is the dimension of ζ (dim ζ) which is then designated by
ζ_p.
We now define the point Q of ζ_p, whose neighborhood is v^1_{q_1} + ... + v^r_{q_r}, as
singular when r > 1, or when r = 1 and dim_Q ζ < p, or else dim_Q ζ = p and Q is
singular for its unique v_p. From property II of §2 for a v_p we have that the
set of all singular points or singular locus is a ζ_r, r < p. A point of ζ_p - ζ_r is
an ordinary point of ζ_p. Its characteristic property is that it possesses relative
to ζ_p a neighborhood which is an analytic E_p.

By an analytic p-element, or merely p-element, ε_p, we shall mean a
relatively open subset of a structure ζ_p, containing at least one ordinary point,
and such that ε̄_p ⊂ ζ_p. Under these conditions we shall describe ε_p and ζ_p as
associated with each other. By an analytic p-complex, κ_p, we shall mean a
finite set of non-intersecting elements ε_q of dimension up to and including p,
which constitute a closed bounded point set in S_n. By convention the empty
set is to be a ζ_{-1}, an ε_{-1} or a κ_{-1}. We shall write F(ζ) = ζ̄ - ζ. We do not consider
here infinite κ's, since they may be taken care of as in Topology.

The intersection of two or more ζ's or ε's is respectively a ζ or an ε. If
κ = Σ ε, κ* = Σ ε* are complexes, so is κ·κ* = Σ ε·ε*. Similarly when κ·ζ is
closed then κ·ζ is the complex Σ ε·ζ.


† Cf. Koopman and Brown, loc. cit., p. 242.






A complex κ' will be called a subdivision of κ if the two coincide as point
sets and if each element of κ' is contained in one of κ. If κ* is any complex on κ
it is clear from the preceding paragraph that κ has a subdivision with one of κ*
as a subcomplex. It follows that κ + κ* can be covered by a complex having
subdivisions of κ and κ* as subcomplexes. For κ and κ* can each be subdivided
to form a complex having a common subcomplex covering κ·κ*.

Whenever throughout κ we have ε̄_q·ε_p = 0 for q ≤ p, κ is said to be normal.
When κ is normal, it remains closed, and hence a κ (moreover a normal κ)
when one or more p-elements are removed from it.

Every complex has a normal subdivision. Given any £,, we shall denote its 
singular locus by (p’ <p). Let then €, be two elements of x, and {,, 
associated structures. We shall first show that there exists a {, > €,-€, such 
that and that the distance from e€,-€, to F(f,) >0. In any case €,-€,¢ 
Also F(f,) ¢ Since no € meets its F(f), 
does not meet F(¢,), and as €,- €, is self-compact the distance of the two sets 
>0. Therefore when s <p we may take ¢, =,. 

Let now s= and let Q be a point of ¢,-€, not on {,, so that Ocf,—{£,, 
and dimef, =~. This implies g=~/ and that the neighborhoods of Q relative
to ¢, and ¢, have a common 2, which is then wholly on ¢, near Q. In that 
case necessarily Q ¢f{,,. For otherwise v, would be a complete neighborhood 
of Q relative to £,, hence it would contain points of ¢, infinitely near Q, and 
we should have e,- €,~#0, which is ruled out. It follows €,-€,¢f+£.. 

Since a singular locus is closed relative to its ¢, and since F(f,) ¢ F({,) 
+F(¢,), we find that satisfies the condition for a 
structure, with F(¢,) ¢ F(¢,)+F(¢,). Since the last two F’s do not meet 
€p*€, this is likewise the case as regards F({,), which implies also ¢, > €,- € 

> €,:€,, and that the distance condition holds. Since r=’ or g’, both <9, 
¢, has all the properties that we require. 

We can find a closed polyhedral neighborhood of €,-€, not meeting F(¢,), 
and its intersection with ¢, is a x,. The sum of these complexes for all p-ele- 
ments is a <p, and ef = €,—x; is an element. Replacing by together 
with the sum of the elements €,-«,, we obtain a subdivision x, , such that 
-€, =O if ef Hence x{ is a« whose dimension <p. The re- 
quired result follows then by induction on p. 

6. The covering theorem. Let κ be any complex and let "vertical"
direction or projection have the same meaning as in §2, IV. Every point of κ
has a neighborhood relative to κ made up of a finite number of v's. Since κ
is self-compact it can be covered with a finite number of v's. It follows then
from §4 that the axes may be so chosen that no vertical meets κ in an infinite
set.




From §§3, 5, we conclude that κ has a subdivision (obtained as its
intersection with a suitable polyhedral complex) whose elements are each on a v
represented by a system (2.4). The subdivision can then be normalized so
that at each step in the process the preceding property is preserved.
Ultimately we turn the complex into a normal complex, still called κ, whose
elements all have the property just described.

Let ε_p be any element of our new κ with its ζ_p given by (2.4). The branch
locus ζ* is of dimension < p. We verify at once that ε* = ε_p·ζ* and ε_p - ε* are
elements with ζ* and ζ_p as associated structures. Referring to the end of §3 we
find also that, when p < n, P(ε_p - ε*) is an ε_p. Moreover when (x_1, ..., x_{n-1})
ranges over P(ε_p - ε*), to certain real roots x_n of H = 0 there will correspond
points (x_1, ..., x_n) which generate elements whose sum is ε_p - ε*.

Assuming that our complex is a κ_p, p < n, we decompose every ε_p of κ_p in
the set of p-elements ε_p plus ε* (whose dimension < p), and repeat the
operation for the elements of next lower dimension of the new complex, etc.
Ultimately then we have in place of κ_p a new normal complex, still to be called
κ_p, such that every ε of κ_p has for projection P(ε) an element, and ε is
represented by an analytic relation x_n = f(Q), Q ⊂ P(ε) (analytic
homeomorphism).

A final subdivision x =)>e’ of x, will now be made, such that P(x, ) 
can be covered with a x/’ =) e’’ having the property that every P(e’) is an 
exact sum of elements e’’. For p =0 this is trivial, hence we use induction on p. 
Taking x, in the reduced form just obtained, xg=xp—)_€, is a complex with 
q<p. Under the hypothesis of the induction it has a subdivision x/ =)>¢’ 
of the desired type. Let 6 be a positive number such that —6<x,<6 on kp, 
and let C(x/) be the (¢+1)-complex whose elements are the parts of the 
vertical cylinders based on the e’’s lying between the spaces x,= +64, to- 
gether with their intersections with these spaces. Let ¢, be an element of 
the original x,. Since an ¢, carries no vertical segment, the intersection 
e--C(xj) consists of elements of dimension <q, some being of dimension g 
when r=g. Therefore x,-C(x/) is a g-complex, and since g<f, it has a sub- 
division «,*’ such that P(x,*’) is covered by a x;’ of the required type. 
Given any of x, we form a new element =¢,—x,*’. Then xy =x,*’ 
+>°e/ is the required subdivision of x». For let x; contain m p-elements 
and let n»*=P(e/*). When m=1, we can take =x/’ +n’. Therefore 
we may use induction on m. Removing e’™ from x, we have a complex 
k,*’ which, under the hypothesis of the induction, possesses an as- 
sociated x,*’’ covering P(x,*’). Now 7’ ="—x«,*” is also an element and 
kp’ =n’ +x,*’’ is a covering of ) such as we are seeking. 

Observe that every ε' is still analytically homeomorphic with its
projection P(ε') since this holds as regards the ε of κ_p on which it lies. We are now
ready for the


THEOREM. Every analytic complex has a simplicial subdivision. 


We first assume p < n, and κ_p in its ultimate reduced form, κ'_p, κ''_p having
the same meaning as above. The theorem being trivial for n = 0 we use
induction on n. Since κ''_p is on an S_{n-1} it has then a simplicial subdivision
into simplexes σ_q. Let QQ' be any vertical with Q ⊂ σ_q. It intersects κ'_p in
points Q^1, Q^2, ..., Q^r (r finite) each on a different ε' of κ'_p, say Q^i ⊂ ε'^i.
When Q ranges continuously over σ_q, by the above Q^i remains on ε'^i and
generates a homeomorph of σ_q, a cell E^i_q ⊂ ε'^i, and no two of these cells
intersect. As a consequence K_p = Σ E^i_q is a cellular subdivision of κ'_p and
hence of κ_p. We shall now show that the
homeomorphism between E, and E} can be extended to their boundaries. 
This merely requires that we prove that when Q c F(£,), it has a unique image 
on F (E}). Suppose that it has s images Q’i. We may choose for each Q’i a 
neighborhood relative to EZ,‘ consisting of a cell Z,‘i whose projection is a 
simplex o (in the straightness of o,), no two of the cells Z,*/ intersecting. As 
a consequence o/-0*=0 (jh) and Q has for neighborhood relative to o, 
a set of s non-intersecting g-simplexes, which can only be if s =1, as asserted. 
Since E,‘ and &, are homeomorphic, £’ is simplicial and so is Ky. 

If we have a κ_n, on removing its n-elements we have a κ_p, p < n, which we
identify with the κ_p just considered. When Q ⊂ σ_q, the points of κ_n projected
on Q may include some of the segments Q^iQ^{i+1} and we observe that, since r
is fixed throughout any σ_q, if the segment is zero anywhere on a face of σ_q
it is zero throughout that face. As a consequence we find by an elementary
induction that when Q ranges over σ_q the segments ≠ 0 generate (q+1)-cells
E_{q+1} whose structure is that of a truncated simplicial prism. Since these cells
are convex, the covering K_n thus obtained for κ_n is convex, and its first
derived, which is simplicial, answers the question.

COROLLARY. If κ_p ⊃ κ_q, κ_p has a simplicial subdivision with a subcomplex
covering κ_q.

For κ_p has a subdivision κ'_p having a subdivision of κ_q as a subcomplex.
In particular κ_p may be a closed polyhedral region of S_n containing κ_q. This is
substantially the theorem of Topology, p. 364.


PRINCETON UNIVERSITY, 
PRINCETON, N. J. 




NON-CONJUGATE OSCULATING QUADRICS 
OF A CURVE ON A SURFACE* 


BY 
R. C. BULLOCK 


1. Introduction. This paper is concerned with making a study of the
projective differential geometry of a non-conjugate net of curves on a surface
in three-space by means of a pair of osculating quadrics defined in the
following manner. Consider a curve C on the surface S. At a point P of C and at two
neighboring points P_1, P_2 on C construct the tangents of the curves of one
family of the non-conjugate net. The limit of the quadric surface determined
by these three lines as the points P_1, P_2 approach P along C is a non-conjugate
osculating quadric at the point P on C. The other osculating quadric is
obtained in a similar manner by drawing tangents to the other family of the
net. Thus we associate with each point of the surface a pair of osculating
quadrics analogous to the asymptotic osculating quadrics† of Bompiani and
Klobouček and the conjugate osculating quadrics‡ of Lane.

We compute the equations of the osculating quadrics and note some re- 
sults that follow rather immediately therefrom. The chief contribution of the 
paper is a complete discussion of the nature of the curve of intersection of the 
two osculating quadrics at a point of a curve on the surface. We also study 
the curve of intersection of corresponding osculating quadrics for two curves 
which are respectively members of two families of curves which form a 
conjugate net on the surface. As a by-product we obtain a new necessary and 
sufficient condition that a net of curves on the surface be a conjugate net. 

2. Analytic basis. In this section we set up an analytic basis for the study 
of a non-conjugate net of curves on a surface in projective three-space follow- 
ing Green’s method.§ We shall also list in this section certain of Green’s 
results that we shall need for later reference. Let it be noted here that we 
shall assume that the surface sustaining the net is not developable, and that 
unless otherwise specifically stated the net in question is not the asymptotic 
net. 


* Presented to the Society, December 29, 1932; received by the editors August 25, 1932. 

† E. P. Lane, Projective Differential Geometry of Curves and Surfaces, University of Chicago
Press, 1932, p. 80.

‡ E. P. Lane, Conjugate nets and the lines of curvature, American Journal of Mathematics, vol. 53
(1931), p. 577.

§ G. M. Green, Nets of space curves, these Transactions, vol. 21 (1920), pp. 207-236.




Let the surface under consideration be the analytic surface whose vector
equation in homogeneous coordinates is


x = x(u, v).


A necessary and sufficient condition that the parametric curves u = const.,
v = const. shall not form a conjugate net is that x shall not satisfy an equation
of Laplace of the form

a x_uv + b x_u + c x_v + d x = 0.


This is equivalent to saying that the fourth-order determinant
W = (x_uv, x_u, x_v, x)


does not vanish. Hence the functions x are solutions of a pair of partial
differential equations of the form


(1) x_uu = px + αx_u + βx_v + Lx_uv, x_vv = qx + γx_u + δx_v + Nx_uv.


If we adjoin certain integrability conditions, which we shall not write down, 
these equations will form a completely integrable system; that is, any funda- 
mental set of solutions can be expressed as a linear combination, with con- 
stant coefficients, of any other fundamental system. We shall base our pro- 
jective theory of surfaces on this completely integrable system of partial 
differential equations. 

If we regard the u-tangent at a point P_x as a generator of the congruence
of u-tangents, its other focal point is given by


(2) ρ = x_u + (β/L)x.


Similarly the focal point of the v-tangent at P_x is given by
(3) σ = x_v + (γ/N)x.


The line joining ρ and σ is called by Green the ray of the point P_x.
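
The focal point (2) can be recovered from equations (1) by a short computation. The following is a sketch, not Green's own derivation; it assumes L ≠ 0, writes Y = x_u + hx for a point of the u-tangent (h is a multiplier introduced here for illustration), and uses the standard criterion that Y is a focal point when dY stays on the line for some direction du : dv.

```latex
\begin{aligned}
Y  &= x_u + h\,x,\\
dY &\equiv (x_{uu})\,du + (x_{uv} + h\,x_v)\,dv
    \equiv (\beta x_v + L x_{uv})\,du + (x_{uv} + h\,x_v)\,dv
    \pmod{x,\ x_u},\\
dY \equiv 0
   &\iff L\,du + dv = 0 \ \text{ and } \ \beta\,du + h\,dv = 0
    \iff h = \beta/L .
\end{aligned}
```

Here x_v and x_uv are independent modulo x, x_u because W ≠ 0. The direction dv = 0 gives the other focal point, x itself, and interchanging the roles of u and v gives (3) in the same way.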

Since the parametric net is not asymptotic the osculating planes of the
u-curve and v-curve through P_x determine a line that passes through P_x but
does not lie in the tangent plane. This line is called by Green the axis of the
point P_x. The totality of axes of points of the surface generate the axis
congruence. The point τ which is the harmonic conjugate of the point x with
respect to the two focal points of the axis is given by


(4) τ = K'x + (γ/N)x_u + (β/L)x_v + x_uv,
where the value of K' is given by
K’ = — 3[b@) + — — (y/N)a®? + (ay)/N + (88)/L 
(S) — Ly?/N* — NB?/L* — 2By/(LN) + (Nyu — Nuy)/N? 
+ — L.8)/L*], 




wherein 
a@) = [L, +a+BN+L(N.+ yL + — LY), 


= [V(L, +a+6N) +Nut +5]/(i — LN), 

= + By + L(ay +yut+QI//(1 — LN), 

= [N(B5 + By + p) + By + — LN). 

The points ρ, σ, τ are covariant points with respect to all transformations
of proportionality factor and independent variable in equations (1) that
preserve the net.

The curvilinear differential equation defining the asymptotic net on the
surface is

(7) L du^2 + 2 du dv + N dv^2 = 0.

Evidently L = N = 0 is a necessary and sufficient condition that the
parametric net be asymptotic. We also see that the surface is developable if, and
only if, 1 - LN = 0.

In order that the net defined by the equation

(dv - λ du)(dv - μ du) = 0 (λ ≠ μ)
shall be a conjugate net the two directions λ and μ must separate harmonically
the asymptotic directions. A necessary and sufficient condition for this is
(8) μ = -(L + λ)/(Nλ + 1).
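
Condition (8) can be checked numerically in a simple case. The sketch below is an illustration only: the values of L, N and λ are arbitrary choices made so that the asymptotic directions, the roots of N m^2 + 2m + L = 0 with m = dv/du, are rational, and the function names are hypothetical. It verifies that the cross ratio of the pair (λ, μ) with respect to the asymptotic pair equals -1, which is what harmonic separation means.

```python
from fractions import Fraction


def conjugate_direction(L, N, lam):
    """Direction mu associated with lam by condition (8)."""
    return -(L + lam) / (N * lam + 1)


def cross_ratio(a, b, c, d):
    """Cross ratio (a, b; c, d) of four finite directions."""
    return ((a - c) * (b - d)) / ((a - d) * (b - c))


# Illustrative values: N m^2 + 2 m + L = m^2 + 2 m - 3 has roots 1 and -3.
L, N = Fraction(-3), Fraction(1)
m1, m2 = Fraction(1), Fraction(-3)
lam = Fraction(2)
mu = conjugate_direction(L, N, lam)
harmonic = cross_ratio(lam, mu, m1, m2) == -1
```

With these values μ = 1/3 and `harmonic` is true; note also that 1 - LN = 4 here, so the example does not fall in the developable case.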

The conjugate net whose directions separate harmonically the
parametric directions is called by Green the associate conjugate net, and it is
defined by the following equation:

(9) L du^2 - N dv^2 = 0.
We list for reference the following invariants due to Green and Grove:* 
r = L + λ, s = Nλ + 1,
t=IN r= — bw, 
= q + bi(6 — bi) — biv + 
Le, 
B = Ne — }(a“ — 5), 
= Lh, — — a), 
G = + ca@) — — 
G’ = + — — bf, 
+ —  — Ne* — 2¢,, 
F’ = + — bya — Lb? — 
W™ =F — NG, 
* V. G. Grove, A theory of a general net on a surface, these Transactions, vol. 28 (1926), p. 496.


(6) 


wherein a), are given by (6) and where 
a 1/L, b= -—a/L, c=-P/L, = — p/L, 
a, = 1/N, b, = — a = —J/N, d, = —q@/N, 
[N(a, + By) tay — LN), 
= [86 + p + L(By + 6.) — LN). 


p12) 


3. The non-conjugate osculating quadrics. We now compute the
equations of the quadrics defined in §1. Let us consider a non-conjugate net N on
a surface S of the kind specified in §2, and suppose S is an integral surface of
the system (1). On this surface let us consider a curve C_λ which is a member
of the family defined by the equation


(11) dv - λ du = 0,


and suppose this family of curves is not conjugate to either family of the net
N. We shall first find the equation of the osculating quadric Q_u determined by
the tangents to three consecutive u-curves at a point P_x of C_λ. The equation
will first be referred to the local tetrahedron of reference whose vertices are
the points x, x_u, x_v, x_uv. We may define any point X on C_λ near P_x by the
following power series in Δu:


X = x + x'Δu + x''Δu^2/2 + ··· (x' = dx/du).

By making use of the differential equations (1) and expressions obtained from
them by differentiation this series can be expressed in the form


X = x_1 x + x_2 x_u + x_3 x_v + x_4 x_uv,


where, for a suitably chosen unit point, x_1, x_2, x_3, x_4 are local coordinates of
the point X referred to the local tetrahedron x, x_u, x_v, x_uv, and the power
series defining them sufficiently far for the purposes of this paper are


(12) x_1 = 1 + ···,
     x_2 = Δu + ···,
     x_3 = λΔu + (β + λ' + δλ^2)Δu^2/2 + ···,
     x_4 = (L + 2λ + Nλ^2)Δu^2/2 + ···.
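
The second-order coefficients in (12) can be checked directly. The following is a sketch of that check, assuming the system (1) in the form x_uu = px + αx_u + βx_v + Lx_uv, x_vv = qx + γx_u + δx_v + Nx_uv, with primes denoting total derivatives d/du along the curve dv = λ du.

```latex
\begin{aligned}
x'  &= x_u + \lambda x_v,\\
x'' &= x_{uu} + 2\lambda x_{uv} + \lambda^{2} x_{vv} + \lambda' x_v\\
    &= (p + q\lambda^{2})\,x + (\alpha + \gamma\lambda^{2})\,x_u
       + (\beta + \lambda' + \delta\lambda^{2})\,x_v
       + (L + 2\lambda + N\lambda^{2})\,x_{uv}.
\end{aligned}
```

The coefficients of x_v and x_uv in x''Δu^2/2 are the second-order terms of the third and fourth local coordinates, while x' contributes the first-order terms Δu and λΔu of the second and third.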




Similarly for a point X_1 on the u-tangent near P_x we get the local power
series expansions
yi = 
yo=1+adut+---, 
ys = BAu + + + (L + 2r)c@” 
+ |Au?/2 
y= (L+r)dut+ [V+ Ll. +68 + aL 
+ (L + + |Au?/2 tere, 
Now let us set 
= (B +0! + 0%)/2, 
= (L + 24+ Nd*)/2, 
es = [Bu + + (L + 4 
8+ aL + (L+ + ]/2. 


Any point on the line XX_1 is given by the linear combination


(13) 


mX + nX_1 (m, n scalars).


On making use of equations (12) and (13) the power series expansions for the
local coordinates of the point are found to be

(14) z_1 = m + ···,
     z_2 = n + ···,
     z_3 = (λm + βn)Δu + (e_1 m + e_3 n)Δu^2 + ···,
     z_4 = (L + λ)nΔu + (e_2 m + e_4 n)Δu^2 + ···.


Now demand that the power series (14) satisfy the most general equation of
a quadric surface identically in m, n and identically in Δu as far as terms of
the second degree. The computation involved in doing this is rather
laborious, and the details will be omitted here. The equation of the quadric Q_u is
found to be
(15) (L — Nd?*)x? + + + — rxex3 + = 0, 
where r was given in equations (10) and R and S' are given by

R = — Lbi) + — Ne) + — 28 — r)’], 

S’ = (6? + dB’ — — pr? — BN)/r + Al[c@Y + Aco], 

The equation of the osculating quadric Q_v may be obtained immediately
from that of Q_u by performing in equation (15) the following substitution:


“ux L 

(16) ( 2 a B p ) 
v 6 g N 

The resulting equation of Q_v is

(17) (L— Nd?*)x? + Proxy + Ox? — — + = 0, 


where s was given in equations (10) and P and Q are defined by 


P = + (log s)’ — — Lbi) — 2X(B — No), 
Q = — + — y(log d)’ — — (62 + 


If we specialize our net to be the asymptotic net by putting L = N = 0,
equations (15) and (17) reduce to the equations for the asymptotic osculating
quadrics of Bompiani and Klobouček.

4. A covariant tetrahedron of reference. The tetrahedron to which the 
equations of the osculating quadrics were referred in the last section was not 
a covariant tetrahedron, and hence we were unable to express the coefficients 
in the equations completely in terms of invariants. We find it advantageous 
to change our tetrahedron of reference to a new covariant tetrahedron whose 
vertices are the points $x$, $\rho$, $\sigma$, $r$, of which the last three were defined by
equations (2), (3), (4) of §2. The transformation that effects this change of
reference system is


= yi + (B/L)y2 + (x/N) ys + 
= yo + 
+ (8/L) 
= M4, 
where the x's are the old variables and the y's are the new. After renaming the
variables so that the x's are the new local coordinates, the transformed
equations of $Q_u$ and $Q_v$ are respectively


(L — Nd?*) xP + + + + AG — AF’ 
+ — AR/r) x? + — 2Zrxexs = 0, 

(L — Nd?*)x? — [248 + 2C’ — (log s)’] + (F — dG’ 
+ + — + 2sxex3 = 0. 


(18) 


The coefficients in the equations of the osculating quadrics are now
expressed completely in terms of invariants defined in (10).

5. Some immediate results. In this section we establish some theorems
analogous to those of Lane for a conjugate net and in addition obtain an
interesting reciprocity property.

The osculating quadric $Q_u$ intersects the tangent plane, $x_4 = 0$, in the
u-tangent, $x_3 = x_4 = 0$, and the residual line


(19) $x_4 = (L - N\lambda^2)x_3 - 2\lambda r x_2 = 0.$


This line is the v-tangent if, and only if, $L - N\lambda^2 = 0$. Likewise $Q_v$ intersects
the tangent plane in the v-tangent, $x_2 = x_4 = 0$, and the residual line


(20) $x_4 = (L - N\lambda^2)x_2 + 2s x_3 = 0.$


On referring to equation (9) we conclude the following result: 

A necessary and sufficient condition that a curve on our surface belong to the 
associate conjugate net is that either of the two osculating quadrics have the tan- 
gents of the net as generators at every point of the curve. 

Now suppose $L - N\lambda^2 \neq 0$. Then the two lines (19) and (20) coincide at
every point of $C_\lambda$ if, and only if, equation (7) holds. Thus we have the
following theorem:

The residual intersections of the osculating quadrics with the tangent plane 
coincide at every point of a curve if, and only if, the curve is a member of one fam- 
ily of the asymptotic net on the surface. The line of coincidence is the tangent to 
the curve. 

The coordinates $\xi_i$ of the polar plane of any point $x$ with respect to the
osculating quadric $Q_u$ are given by


= = — 2drxs, 
= — 2drxe + 2(L — 
(21) + + 2NC’ + t/r) xs, 


2r2x1 + + t/r) x3 
+ — AF’ + Aw — AR/r) 


The polar plane of the point $\rho$, whose local coordinates are (0, 1, 0, 0),
with respect to $Q_u$ is the plane $x_3 = 0$, which is the osculating plane to the
u-curve $C_u$ at $P_x$. But we see from the first of equations (18) that $\rho$ lies on $Q_u$,
hence we conclude that the osculating plane to the curve $C_u$ at a point of a curve
$C_\lambda$ is tangent to the corresponding osculating quadric $Q_u$ at the point $\rho$. In a
similar manner it may be shown that the osculating plane to $C_v$ at $P_x$ is tangent
to $Q_v$ at the point $\sigma$.

By referring to (21) we see that the polar plane of the point $\sigma$ with respect
to $Q_u$ passes through the point $P_x$, and it passes also through the point $r$ if,
and only if,

(22) 2n°B + 2G’ + t/r = 0. 


We readily draw the following conclusion: 

Condition (22) is necessary and sufficient for the ray and axis to be
reciprocal polar lines with respect to the quadric $Q_u$.

Similar results hold for the quadric $Q_v$.

6. Nature of the intersection of the osculating quadrics. We now make a
complete study of the nature of the curve of intersection of the osculating
quadrics $Q_u$ and $Q_v$. This study is made by means of the elementary divisors
of the matrix of the pencil of quadrics based on $Q_u$ and $Q_v$. The method is to
compute the elementary divisors, write down the characteristic, and ascertain
the general nature of the intersection by referring to results tabulated
by Snyder and Sisam.*

For the sake of brevity let us write equations (18) for the osculating
quadrics $Q_u$ and $Q_v$ in the following form:


$A x_3^2 + 2B x_3 x_4 + C x_4^2 + 2\lambda^2 x_1 x_4 - 2\lambda r x_2 x_3 = 0,$
(23)
$A x_2^2 - 2D x_2 x_4 + E x_4^2 - 2 x_1 x_4 + 2s x_2 x_3 = 0,$


where $A$, $B$, $C$, $D$, $E$ are new notations introduced here, and their values may
be readily read off from (18). We now give the results obtained for the
various cases that arise in our problem.

Case I. If $A \neq 0$, $B - D\lambda \neq 0$, $L + 2\lambda + N\lambda^2 \neq 0$, we find that the
characteristic is [13]. Hence the intersection of the quadrics $Q_u$ and $Q_v$ at each point
of a curve $C_\lambda$ satisfying the conditions stated is a quartic curve with a cusp.
But if $B - D\lambda = 0$ we get by using the values of $B$ and $D$


(24) t/r + (log s)’ = 0, 


and 


t/r = (LN — L’)/(L +2) =  — Allog (L +d)’. 
Substituting this in (24), dividing by $\lambda$, and integrating we get

$\lambda(N\lambda + 1)/(L + \lambda) = -e$ ($e$ = const.).


Now by replacing $\lambda$ by $dv/du$ we see that if $C_\lambda$ is a curve such that $B - D\lambda = 0$
then $C_\lambda$ belongs to the pencil of families of curves defined by the curvilinear
differential equation


(25) $eL\,du^2 + (e + 1)\,du\,dv + N\,dv^2 = 0.$


But let us note that for $e = -1$ this becomes $L - N\lambda^2 = 0$, and for $e = +1$ it
becomes $L + 2\lambda + N\lambda^2 = 0$ after dividing through by $du^2$. Thus the three
conditions imposed at the beginning of this case are all implied by the second.
Since in later work no other case arises in which the curve of intersection of
the two quadrics is non-degenerate, we state the following theorem:


The two osculating quadrics at each point of a curve on our net intersect in a
non-composite quartic curve if, and only if, the curve does not belong to the pencil
of families of curves defined by equation (25). The quartic has a cusp at the point
$P_x$.

In our general treatment no case arises in which at each point of a curve on
our net the total intersection of the pair of osculating quadrics contains as a part
of it a non-degenerate cubic.

* V. Snyder and C. H. Sisam, Analytic Geometry of Space, New York, 1914, p. 163.

Case II (a). Suppose $B - D\lambda = 0$, $A \neq 0$, $L + 2\lambda + N\lambda^2 \neq 0$,
$A(C + E\lambda^2) - B^2 \neq 0$. We get the characteristic [1(21)], which shows that the curve of
intersection is composed of two conics which touch each other. The conics
touch at $P_x$ and lie in the planes whose equations are


(26) $A\lambda(x_3 - \lambda x_2) + [B\lambda \pm \lambda(B^2 - A(C + E\lambda^2))^{1/2}]\,x_4 = 0.$
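
The planes (26) can be cross-checked with a small exact computation. The sketch below is only an illustration under stated assumptions: it takes the two forms in (23) as written, and it assumes $A = L - N\lambda^2$, $r = L + \lambda$, $s = 1 + N\lambda$ (read off from the residual lines (19), (20), and (37); equations (10) are not reproduced in this section). Under the Case II hypothesis $B = D\lambda$, the member $Q_u + \lambda^2 Q_v$ of the pencil should then reduce to a quadratic form in $w = x_3 - \lambda x_2$ and $x_4$ alone, whose two linear factors are the planes (26). All function names are hypothetical.

```python
from fractions import Fraction as F

def q_u(x, A, B, C, lam, r):
    """Left member of the first equation of (23)."""
    x1, x2, x3, x4 = x
    return (A * x3**2 + 2 * B * x3 * x4 + C * x4**2
            + 2 * lam**2 * x1 * x4 - 2 * lam * r * x2 * x3)

def q_v(x, A, D, E, s):
    """Left member of the second equation of (23)."""
    x1, x2, x3, x4 = x
    return (A * x2**2 - 2 * D * x2 * x4 + E * x4**2
            - 2 * x1 * x4 + 2 * s * x2 * x3)

def plane_pair(x, A, B, C, E, lam):
    """A w^2 + 2 B w x4 + (C + E lam^2) x4^2 with w = x3 - lam x2."""
    _, x2, x3, x4 = x
    w = x3 - lam * x2
    return A * w**2 + 2 * B * w * x4 + (C + E * lam**2) * x4**2

def pencil_member_is_plane_pair(L, N, lam, C, D, E, points):
    """Check Q_u + lam^2 Q_v == plane_pair at every test point (exactly)."""
    A = L - N * lam**2   # coefficient A of (23)
    r = L + lam          # assumed form of the invariant r
    s = 1 + N * lam      # assumed form of the invariant s
    B = D * lam          # Case II hypothesis B - D lam = 0
    return all(
        q_u(x, A, B, C, lam, r) + lam**2 * q_v(x, A, D, E, s)
        == plane_pair(x, A, B, C, E, lam)
        for x in points
    )

# Arbitrary rational test points (x1, x2, x3, x4).
POINTS = [
    (F(1), F(2), F(-3), F(5)),
    (F(0), F(1), F(1), F(1)),
    (F(7, 3), F(-1, 2), F(4), F(-2)),
]
```

Solving $A w^2 + 2Bw x_4 + (C + E\lambda^2)x_4^2 = 0$ for $w$ and multiplying by $\lambda$ gives the two planes of (26).
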
Case II (b). If in II(a) we have $A(C + E\lambda^2) - B^2 = 0$, then the
characteristic is [1(111)], and hence the intersection is a conic counted twice. The
equations of this conic are found to be


$A(x_3 - \lambda x_2) + B x_4 = 0,$
A(L + 2X + Nd*) x? + 2Brsxgxy + ACx? + 2AM = O. 


Case II (c). If $L + 2\lambda + N\lambda^2 = 0$, $A(C + E\lambda^2) - B^2 \neq 0$, the characteristic
is [(22)], and the intersection is three generators, one counted twice. The
generator that is counted twice is the line $x_4 = x_3 - \lambda x_2 = 0$, which is tangent to the
curve $C_\lambda$ at $P_x$. The other two generators are the lines whose equations are as
follows:


(27) $A\lambda(x_3 - \lambda x_2) + [B\lambda \pm \lambda(B^2 - A(C + E\lambda^2))^{1/2}]\,x_4 = 0,$


(28) 
+ 2d2x1) — — ACC + x3 = 0, 


where the upper signs in the two equations are paired to give one generator 
and the lower signs the other. 

Case II (d). If now in II(c) we let $A(C + E\lambda^2) - B^2 = 0$, then the
characteristic is [(211)], and the intersection is two intersecting generators, each
counted twice. The equations of these generators are respectively


= X%3 — Axe = 
A(x3 = + Bu, 


(29) 


+ 2BrASx3 + = 0. 




From the above results we note the following theorem: 


At each point of an asymptotic curve on the net the intersection of the osculat- 
ing quadrics degenerates completely into generators. 


Case III. Now let $A = 0$, $B \neq 0$. The characteristic is then [(31)], and
the intersection is two intersecting generators, the tangents of our net at $P_x$,
and a residual conic. We thus obtain the following result by using (22):


If the ray and axis are not reciprocal polar lines with respect to the osculating
quadric $Q_u$, and if the curve $C_\lambda$ belongs to the associate conjugate net ($A = 0$),
then the quadrics $Q_u$ and $Q_v$ intersect in the tangents of the net and a residual
non-degenerate conic. The equations of this conic are


x2 2Bx3 om (C + En?) x4 = 0, 


30 
(30) CDx2 + [2BD — s(C + Ed?) — 2Bsx? + = 0. 


Case IV. If $A = B = 0$, $C + E\lambda^2 \neq 0$, we find that the characteristic is
[(211)], and the intersection is composed of the tangents of the net counted twice.
The quadrics touch at each point of each tangent.

Case V. If $A = B = C + E\lambda^2 = 0$, the characteristic is [(1111)], and the
two quadrics $Q_u$ and $Q_v$ coincide. Their equations both become


(31) $C x_4^2 + 2\lambda^2 x_1 x_4 - 2\lambda r x_2 x_3 = 0.$


This completes the discussion of all cases that arise. 

7. Conjugate nets. Let us now consider two curves $C_\lambda$ and $C_\mu$ on our net,
and let us suppose that they are respectively embedded in two conjugate
one-parameter families of curves on the net. By equation (8) we see that a
necessary and sufficient condition for this is $\mu = -r/s$. Let $P_x$ denote the point
where $C_\lambda$ crosses $C_\mu$. We shall denote by $Q_{u\lambda}$ the osculating quadric $Q_u$ of
$C_\lambda$ at $P_x$ and by $Q_{u\mu}$ the osculating quadric $Q_u$ of $C_\mu$ at $P_x$. The equation of
$Q_{u\mu}$ may be obtained from that of $Q_{u\lambda}$ simply by replacing $\lambda$ by $-r/s$. The
equations of the osculating quadrics $Q_{u\lambda}$ and $Q_{u\mu}$ are thus as follows:


Ax? + + Cx? + 2d? x14, — = 9, 


32 
( A(i LN) x? + 2H + Kx? + 2ar(1 LN) x2x3 = 0, 


where the first of these is the same as the first of equations (18) and H and 
K are defined by 
H = — + s?[L(rs’ — r’s) 
+ r(Lus — Lor)]/[2s(LN — 1)], 
K =r{ar — rF’ — sG — (rsk)/[MLN — 1)]}. 




The cone projecting the curve of intersection of the two quadrics (32) from the 
second vertex of the tetrahedron of reference is given by 


2[B(1 — LN) — H]xsx, + — LN) — K] x? 


(33) + — LN) — = 0. 


This cone evidently consists of two planes, one of which is the tangent plane,
$x_4 = 0$. But the tangent plane intersects each of the quadrics (32) in the two
lines


$x_3 = x_4 = 0,$
(34)
$x_4 = A x_3 - 2\lambda r x_2 = 0.$
Hence we draw the following conclusion: 


At any point on the net the osculating quadric $Q_{u\lambda}$ of a curve of one family of a
conjugate net always intersects the corresponding quadric $Q_{u\mu}$ of the curve of the
other family in two conics, one of which is degenerate and is represented by
equations (34). The other conic lies in the plane


(35) 2L(L + 2d + Nd?*) a1 — 2[B(1 — LN) — H]x3 — [C(1 — LN) — K]x = 0. 


Let us note that since $C_\lambda$ and $C_\mu$ are assumed distinct it follows as a
consequence that $L + 2\lambda + N\lambda^2 \neq 0$, and previous restrictions insure that $L\lambda r \neq 0$.
Now let us project the curve of intersection of the quadrics (32) from the
first vertex, $P_x$, of the tetrahedron of reference. The equation of the projecting
cone is


A[r? — — LN) xs? + 2(r?B — + { + 22 


(36) + — LN)} x2 — — — LN) ]xoxs = 0. 


Evidently the lines (34) are generators of this cone; so this is really the cone
projecting from $P_x$ the conic lying in the plane (35), which does not pass
through $P_x$. Hence the condition for the cone (36) to degenerate is just the
condition for the residual conic to degenerate. But evidently $W^{(u)} = 0$ is
necessary and sufficient for this, and, as Green has shown, this is just the
condition that insures that the u-tangents of the net form a W-congruence; that
is, that the asymptotic curves on the two focal sheets of the congruence
correspond. Hence we have proved the following theorem:


A necessary and sufficient condition that the congruence of tangents to the
curves $C_u$ of the net be generators of a W-congruence is that the residual conic of
intersection of the two quadrics $Q_{u\lambda}$ and $Q_{u\mu}$ degenerate at every point of the net.


We have already seen that if the curve $C_\mu$ is embedded in a one-parameter
family of curves that is conjugate to the family to which $C_\lambda$ belongs, then the
osculating quadrics $Q_{u\lambda}$ and $Q_{u\mu}$ each intersect the tangent plane, $x_4 = 0$, in the
lines (34); that is, in a u-tangent and a residual line. Now let us suppose that
$C_\lambda$ and $C_\mu$ are any two curves which are respectively members of two distinct
one-parameter families of curves, neither of which coincides with either
family of the net itself. Then the corresponding osculating quadrics $Q_u$ for
these two curves ordinarily will intersect the tangent plane in the u-tangent,
$x_3 = x_4 = 0$, and the two residual lines


$x_4 = (L - N\lambda^2)x_3 - 2\lambda(L + \lambda)x_2 = 0,$
(37)
$x_4 = (L - N\mu^2)x_3 - 2\mu(L + \mu)x_2 = 0.$


Now if we demand that these two residual lines coincide we get $\mu = -r/s$.
Hence we draw the following conclusion:


A necessary and sufficient condition that a net on our surface be a conjugate
net is that the residual intersections (37) of the quadrics $Q_{u\lambda}$ and $Q_{u\mu}$ with the
tangent plane coincide at every point of the net.
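
The coincidence condition for the two residual lines (37) is easy to confirm with exact rational arithmetic. The following is a minimal sketch, not part of the original argument; it assumes $r = L + \lambda$ and $s = 1 + N\lambda$ (forms inferred from (19), (20), and (37), since equations (10) are not reproduced here), and the helper names are hypothetical.

```python
from fractions import Fraction as F

def residual_line(L, N, lam):
    """Coefficients (c3, c2) of the line c3*x3 - c2*x2 = 0, x4 = 0, as in (37)."""
    return (L - N * lam**2, 2 * lam * (L + lam))

def lines_coincide(L, N, lam, mu):
    """Two lines through the origin of the (x2, x3) plane coincide exactly
    when their coefficient vectors are proportional."""
    c3a, c2a = residual_line(L, N, lam)
    c3b, c2b = residual_line(L, N, mu)
    return c3a * c2b - c3b * c2a == 0

def conjugate_direction(L, N, lam):
    """mu = -r/s under the assumed forms r = L + lam, s = 1 + N*lam."""
    r = L + lam
    s = 1 + N * lam
    return -r / s
```

For example, with $L = 3$, $N = 2$, $\lambda = 1/2$ the conjugate direction is $\mu = -7/4$, and `lines_coincide` holds for that pair while failing for a generic $\mu$ such as $1$.
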


8. Metric considerations. It is the purpose of this section to indicate a 
method of studying the analogue of the preceding problem in metric three- 
space and to give a few results that have been obtained by the author for the 
metric situation. Throughout this section we shall employ the notation and 
vector methods of Blaschke* for studying the geometry of a surface in ordi- 
nary space. 

The equations of the osculating quadrics at the point x of a curve on the 
surface are here referred to a local orthogonal cartesian coordinate system, 
which is defined by the convention that any point X given by an expression 
of the form 


(38) X — = + yox,/G'!? + 


where $\xi$ is the unit normal at the point $x$, shall have local coordinates
$y_1$, $y_2$, $y_3$.

If we now let the partial differential equations of Gauss and Weingarten
play the role of equations (1) and compute the equation of $Q_u$ in a manner
similar to that of §3, we find that the equation of $Q_u$ is
(39) EVAL + SGE}?y?2 om 20G1!?-y1 V2 

+ 2PGy1y3 + 2S(EG)"!* yeys + = 0, 


where 


* W. Blaschke, Vorlesungen über Differentialgeometrie, 3d edition, vol. 1, Berlin, 1930, pp. 85-118.




+ — EM)/W*), 


The equation of the osculating quadric $Q_v$ may now be written down by
symmetry, or it may be obtained by direct calculation.
The discriminant A of the quadric (39) is 


A = 0, 


for E, G are positive on a real surface, \+0 since C, was assumed not a u- 
curve, and 00 since C, was supposed not a curve of the family conjugate to 
the family of u-curves of the net. Thus the osculating quadric $Q_u$ is always
non-singular for a real analytic surface.

If we simplify the analysis by taking the parametric net to be orthogonal
we get the following interesting result. The determinant $D'$ of the matrix of
the second degree terms in the resulting equation of the quadric $Q_u$ then
becomes


(40) D! = + Ar + + + Asad), 
wherein 


Ay = — G(LG, + ME,)/2, 
A, = (4L°MG + 21GE,, + ME? — LEG, — 2L,E,G)/4, 




Ag = (2GL.G, + 8L°NG + 4L1GE,, — 2LE,G, — MGE,E,/E + 4LM°G 

+ 2NE? + 2MGE,, — 2E,G, — 4GL,.E, — LGE,G./E)/4, 
A; = (12LMNG — 2LGG,, + 4MGE,, — 3ME,G, — 2GN.,E, + LG.G, 

— 2NE.G.)/4, 
A, = (4M°NG — MGE,G,/E — 2MGG,, + 2MG.G, — LGG?/E + 2GN.G,)/4. 
Now suppose C) is a curve such that A)0. Since by definition the rulings of 
the quadric $Q_u$ are real, we find that the osculating quadric $Q_u$ at every point of
a curve $C_\lambda$ on an orthogonal net is a hyperbolic paraboloid if, and only if, $C_\lambda$
belongs to the family of hypergeodesics defined by $D' = 0$. Otherwise $Q_u$ is a
hyperboloid.


LAMBUTH COLLEGE,
JACKSON, TENN.


ON RIESZ AND CESÀRO METHODS OF SUMMABILITY*


BY
RALPH PALMER AGNEW†


1. Introduction. Marcel Riesz‡ formulated the following method of
summability: Let $r$ be any complex constant and, given a series $u_0 + u_1 + u_2 + \cdots$,
let


(1.1) $A_r$: $\alpha_n = \sum_{k=0}^{n-1} \left(1 - \frac{k}{n}\right)^r u_k$ $(n = 1, 2, 3, \cdots)$;


if $\lim_{n\to\infty} \alpha_n = L$, then $\sum u_n$ is said to be summable $A_r$ to $L$.
In a second note, Riesz§ gave a method which is, when $r > 0$, equivalent‖
(vide Theorem 4.4) to the following: Let $r$ be a complex constant and let¶


(1.2) $B_r$: $\beta(t) = \sum_{k=0}^{[t]-1} \left(1 - \frac{k}{t}\right)^r u_k$;


if $\beta(t)$ approaches a limit $L$ as $t$ becomes infinite over the real set $t \geq 1$, then
$\sum u_n$ is summable $B_r$ to $L$. The second method of Riesz is the following:
Let $R(r) > 0$, and let


(1.3) $D_r$: $\delta(t) = \sum_{k=0}^{[t-]} \left(1 - \frac{k}{t}\right)^r u_k$;


if $\delta(t)$ converges to $L$ as $t$ becomes infinite continuously, then $\sum u_n$ is
summable $D_r$ to $L$. The method $D_r$ is known as the Riesz method of order $r$ and
type $\lambda_n = n$**, and has proved to be one of the most useful of all methods of
summability.

* Presented to the Society, December 30, 1931; received by the editors August 24, 1932.

† National Research Fellow.

‡ Comptes Rendus, vol. 149 (1909), pp. 18-22. In this note Riesz considered only real positive
orders $r$.

§ Comptes Rendus, vol. 152 (1911), pp. 1651-1654. Here again Riesz considered only the case
$r > 0$.

‖ The terminology used in this paper is that given by W. A. Hurwitz, Bulletin of the American
Mathematical Society, vol. 28 (1922), pp. 17-36.

¶ We use the symbols $[t]$ and $[t-]$ to denote respectively the greatest integer $\leq t$ and the greatest
integer $< t$.

** Hardy-Riesz, General Theory of Dirichlet's series, Cambridge Tracts in Mathematics and
Mathematical Physics, No. 18.

In his second note, Riesz outlined a proof that $D_r$ is equivalent to $C_r$,
the Cesàro method of order $r$, when $r > 0$. Chapman* has stated that Riesz's
proof of equivalence of $D_r$ and $C_r$ holds when $r > -1$; but this statement is
incorrect as Theorem 2.1 shows. Hobson† has given a more detailed proof of
equivalence of $D_r$ and $C_r$ when $r > 0$.
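
The three transforms (1.1), (1.2), and (1.3) differ only in the upper index of summation, which a short computation makes concrete. The sketch below is an illustration only: the function names are hypothetical, the series is passed as a finite list `u` of its first terms, and $[t-]$ is computed as `ceil(t) - 1`.

```python
import math

def riesz_A(u, n, r):
    """alpha_n of (1.1): sum over k = 0, ..., n-1 of (1 - k/n)^r u_k, n >= 1."""
    return sum((1 - k / n) ** r * u[k] for k in range(n))

def riesz_B(u, t, r):
    """beta(t) of (1.2): k runs from 0 to [t] - 1, t >= 1."""
    return sum((1 - k / t) ** r * u[k] for k in range(math.floor(t)))

def riesz_D(u, t, r):
    """delta(t) of (1.3): k runs from 0 to [t-], the greatest integer < t."""
    top = math.ceil(t) - 1
    return sum((1 - k / t) ** r * u[k] for k in range(top + 1))

# Example series: u_k = 2^{-k}, whose sum is 2.
u = [0.5 ** k for k in range(60)]
```

At integer arguments `riesz_A(u, n, r)` and `riesz_B(u, n, r)` agree, which is the relation $\alpha_n = \beta(n)$. For non-integer $t$, `riesz_D` contains the one extra term with $k = [t]$; that term is harmless when $R(r) \geq 0$ and is the source of the difficulty when $R(r) < 0$. The base $1 - k/t$ is always positive here, so complex orders `r` can be passed directly.
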

In a third note, Riesz‡ outlined a proof that $A_r$ and $C_r$ are equivalent when
$-1 < r < 1$, and showed that this equivalence does not hold for certain values
of $r > 1$.

It is the object of this paper to discuss $A_r$, $B_r$, $D_r$, $C_r$, and closely related
methods of summability. We shall be especially interested in orders $r$ with
real part $R(r) < 0$.

In §2 we show that $D_r$ does not constitute a useful method of
summability when $R(r) < 0$; and in §§2-3 we discuss modifications of $D_r$ which
may be expected to be useful when $R(r) < 0$. For each complex $r$, these
modifications are found to be equivalent to $B_r$. In §4 we show that $B_r$ and
$D_r$ are equivalent when $r \geq 0$. In §§5-7 we obtain auxiliary results from which
it follows in §8 that $B_r$ and $C_r$ are equivalent when $-1 < R(r) < 0$. The
theorems of §8 give a complete solution of the problem which furnished the point
of departure of this investigation. In §9 we give relations between methods
$B_r$ of different orders. We show in §§10-11 that $A_r$ does not possess certain
properties of $B_r$ when $R(r) < -1$; in particular when $R(r)$ is less than a
certain constant between $-2$ and $-1$, $A_r$ is not consistent with convergence.
Finally, in §12 we point out that when $R(r) < 0$, the methods $A_r$, $B_r$, and $C_r$
are equivalent over a certain class of series.

* Proceedings of the London Mathematical Society, (2), vol. 9 (1910-11), p. 374, second
footnote.

† Theory of Functions of a Real Variable, vol. II, 1926, pp. 90-98.

‡ Proceedings of the London Mathematical Society, (2), vol. 22 (1923-24), p. 418.

2. Ineffectiveness of $D_r$ when $R(r) < 0$. It is well known that $D_r$ is regular
when $r$ is real and $\geq 0$. It can be shown further that $D_r$ is regular when
$R(r) > 0$; and that $D_r$ is not regular when $R(r) = 0$ but $r \neq 0$. To show that $D_r$
does not constitute a useful method of summability when $R(r) < 0$, we will
prove the following theorem.


THEOREM 2.1. In order that $\sum u_n$ may be summable $D_r$ when $R(r) < 0$, it is
necessary and sufficient that $\sum u_n$ have at most a finite number of terms different
from zero.


Sufficiency is easily established. To prove necessity, let us suppose that
$u_p \neq 0$ for a certain index $p > 0$; then $\lim_{t\to p+} |\delta(t)| = +\infty$. Hence if $u_k \neq 0$
for an infinite set of values of $k$, then $\delta(t)$ is unbounded over every interval
$(N, \infty)$ so that $\delta(t)$ cannot converge as $t \to \infty$ and the theorem is proved.*
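
The mechanism behind the necessity argument can be watched numerically. The following is a small sketch under stated assumptions (a hypothetical helper `riesz_D` implementing (1.3) with $[t-]$ computed as `ceil(t) - 1`, a short illustrative series with $u_2 \neq 0$, and the order $r = -1/2$): as $t$ decreases to $p = 2$, the single term $(1 - p/t)^r u_p$ grows without bound.

```python
import math

def riesz_D(u, t, r):
    """delta(t) of (1.3); the upper index [t-] is the greatest integer < t."""
    top = math.ceil(t) - 1
    return sum((1 - k / t) ** r * u[k] for k in range(top + 1))

u = [1.0, 1.0, 1.0, 0.0, 0.0]   # illustrative series with u_2 != 0
r = -0.5                        # any order with R(r) < 0 behaves similarly

# |delta(2 + h)| for h shrinking to 0 from the right.
values = [abs(riesz_D(u, 2 + h, r)) for h in (1e-2, 1e-4, 1e-6)]
```

Here the dominant term is $(h/(2+h))^{-1/2}$, so each hundredfold reduction of $h$ multiplies $|\delta(2+h)|$ by roughly ten.
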
Theorem 2.1 and its proof make it clear that if a useful generalization to
orders with real part $R(r) < 0$ of the Riesz method $D_r$ is to be obtained, the
upper index of summation with respect to $k$ must be a function of $t$ which is
definitely less than $[t-]$. The two methods defined by the transformations


(2.2) $\tau(t) = \sum_{k=0}^{[t-\theta]} \left(1 - \frac{k}{t}\right)^r u_k$,

(2.3) $\rho(t) = \sum_{k=0}^{[(t-\theta)-]} \left(1 - \frac{k}{t}\right)^r u_k$, $0 < t < \infty$,


where $\theta$ is a positive constant, suggest themselves at once as modifications of
$D_r$ which may be useful for every complex order.

Let $r$ be any complex number. Then, corresponding to any given series
$\sum u_n$, the functions $\tau(t)$ and $\rho(t)$ are equal except when $t$ is of the form
$t = n + \theta$ and $u_n \neq 0$, in which case $\tau(n+\theta) \neq \rho(n+\theta)$. Furthermore the
transforms $\tau(t)$ and $\rho(t)$ are continuous except when $t = n + \theta$ and $u_n \neq 0$; here
$\tau(t)$ and $\rho(t)$ have finite jumps, $\tau(t)$ having right-hand continuity and $\rho(t)$ having
left-hand continuity. It follows that if either $\tau(t)$ or $\rho(t)$ converges as $t \to \infty$,
then the other must also converge to the same value as $t \to \infty$. Hence the
methods (2.2) and (2.3) are equivalent. We elect to consider the first rather
than the second of these.
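
The relation between (2.2) and (2.3) can be illustrated directly. The sketch below is hypothetical helper code, not the author's: `tau` sums up to the greatest integer $\leq t - \theta$ and `rho` up to the greatest integer $< t - \theta$, so the two differ only at $t = n + \theta$, by the finite jump $(\theta/(n+\theta))^r u_n$. The value $\theta = 1/2$ and the sample points are chosen so that $t - \theta$ is exact in floating point.

```python
import math

def tau(u, t, r, theta):
    """(2.2): upper index [t - theta], the greatest integer <= t - theta."""
    top = math.floor(t - theta)
    return sum((1 - k / t) ** r * u[k] for k in range(top + 1))

def rho(u, t, r, theta):
    """(2.3): upper index is the greatest integer < t - theta."""
    top = math.ceil(t - theta) - 1
    return sum((1 - k / t) ** r * u[k] for k in range(top + 1))

u = [1.0, -2.0, 3.0, 0.5]   # illustrative terms u_0, ..., u_3
r = -0.5 + 1j               # a complex order with R(r) < 0
theta = 0.5
```

With these values `tau(u, 2.7, r, theta)` equals `rho(u, 2.7, r, theta)`, while at `t = 2.5` the two differ by the term with $k = 2$. For $t < \theta$ both sums are empty.
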

3. Consideration of (2.2) for different values of $\theta$. In this section we will
establish a theorem which will be of fundamental importance in the sequel;
and will show that, for any fixed complex $r$, the methods (2.2) obtained by
selecting different positive values of $\theta$ are equivalent to $B_r$.


THEOREM 3.1. If $\sum u_n$ is summable (2.2) with $r$ a fixed complex constant and
$\theta$ a fixed positive constant, then


(3.11) $\lim_{n\to\infty} u_n/n^r = 0.$


Suppose $\sum u_n$ is summable (2.2) to $L$. Then

$\lim_{t\to\infty} \sum_{k=0}^{[t-\theta]} \left(1 - \frac{k}{t}\right)^r v_k = 0,$

where $v_0 = u_0 - L$ and $v_n = u_n$ when $n > 0$. Given $\epsilon > 0$, choose $T$ such that

(3.12) $\left| \sum_{k=0}^{[t-\theta]} \left(1 - \frac{k}{t}\right)^r v_k \right| < \frac{\epsilon}{2}$, $t > T$.

Let $n > T + 1$, let $0 < h < 1$, and set $t = n + \theta - h$ in (3.12) to obtain

(3.13) $\left| \sum_{k=0}^{n-1} \left(1 - \frac{k}{n+\theta-h}\right)^r v_k \right| < \frac{\epsilon}{2}$, $0 < h < 1$.

The left member of (3.13) is a continuous function of $h$ over the closed
interval $0 \leq h \leq 1$; hence we may take the limit as $h \to 0$ to obtain

(3.14) $\left| \sum_{k=0}^{n-1} \left(1 - \frac{k}{n+\theta}\right)^r v_k \right| \leq \frac{\epsilon}{2}$, $n > T + 1$.

Again we may set $t = n + \theta$ in (3.12) and write the last term of the sum as
a separate term to obtain

(3.15) $\left| \sum_{k=0}^{n-1} \left(1 - \frac{k}{n+\theta}\right)^r v_k + \left(\frac{\theta}{n+\theta}\right)^r v_n \right| < \frac{\epsilon}{2}$, $n > T + 1$.

Combining (3.14) and (3.15), we find that $|\theta^r v_n/(n+\theta)^r| < \epsilon$ when $n > T + 1$.
Hence $\lim_{n\to\infty} v_n/(n+\theta)^r = 0$ so that $\lim_{n\to\infty} v_n/n^r = 0$ and, since $v_n = u_n$ when
$n > 0$, (3.11) follows. Thus Theorem 3.1 is proved.

A slight modification of the preceding argument shows that if $\sum u_n$ is
bounded (2.2), then $u_n/n^r$ is bounded for all $n > 0$.

* There is an apparent inconsistency between Theorem 2.1 and Chapman's statement (loc. cit.,
p. 401) that the Dirichlet series $1^{-s} - 2^{-s} + 3^{-s} - 4^{-s} + \cdots$, $s > 0$, is summable $(R, n, -r)$, i.e. $D_{-r}$,
when $r < s$. The last equation of p. 399 shows that Chapman has used the transformation $B_r$ rather
than $D_r$; and furthermore the second equation of p. 400 is correct only when $[n] = n$. Therefore, as a
matter of fact, Chapman has not shown that $\sum (-1)^n (n+1)^{-s}$ is summable $D_{-r}$ when $r < s$; what he
has shown is that the series is summable $A_{-r}$ when $r < s$. However, it follows from this result and
Theorem 12.1 that the series $\sum (-1)^n (n+1)^{-s}$ is summable $B_{-r}$ and $C_{-r}$ as well as $A_{-r}$ when $r < s$.


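THEOREM 3.2. If $r$ is a complex constant and

$\tau(t) = \sum_{k=0}^{[t-\theta]} \left(1 - \frac{k}{t}\right)^r u_k$, $\tau'(t) = \sum_{k=0}^{[t-\theta']} \left(1 - \frac{k}{t}\right)^r u_k$

represent two different methods of the form (2.2) with $\theta > 0$, $\theta' > 0$, and if
furthermore $\lim_{n\to\infty} u_n/n^r = 0$, then


(3.21) $\lim_{t\to\infty} \{\tau(t) - \tau'(t)\} = 0.$


To establish this result, we may assume that $\theta > \theta'$ and show that the
difference in the left member of (3.21) consists of a finite number of terms
each of which approaches zero as $t \to \infty$.

From the two preceding theorems we obtain at once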


THEOREM 3.3. When $r$ is any complex constant, the methods obtained by
assigning different positive values to $\theta$ in (2.2) are equivalent.

For if $\sum u_n$ is summable (2.2) for a positive value of $\theta$, then $\lim u_n/n^r = 0$
by Theorem 3.1; hence the hypotheses of Theorem 3.2 are satisfied and the
conclusion (3.21) completes the proof of Theorem 3.3.

The only representative of the set of methods (2.2) which we will
consider in the sequel is that for which $\theta = 1$; in this case (2.2) becomes $B_r$.

4. Relations between $B_r$ and $D_r$ when $R(r) \geq 0$. Before passing to a study
of $B_r$ when $R(r) < 0$, we wish to point out that $B_r$ is closely related to the
familiar Riesz method $D_r$ when $R(r) \geq 0$.


THEOREM 4.1. If $R(r) \geq 0$ and $\lim u_n/n^r = 0$, then

(4.11) $\lim_{t\to\infty} \{\delta(t) - \beta(t)\} = 0.$


We find from (1.2) and (1.3) that $|\delta(t) - \beta(t)| \leq |u_{[t]}/[t]^r|$ when $R(r) \geq 0$
and $t > 1$, and Theorem 4.1 follows.

THEOREM 4.2. If $R(r) \geq 0$, then $D_r$ includes $B_r$.

If $\sum u_n$ is summable $B_r$ to $L$ so that $\lim \beta(t) = L$, then $\lim u_n/n^r = 0$ by
Theorem 3.1 with $\theta = 1$; hence the hypotheses of Theorem 4.1 are satisfied,
the conclusion (4.11) shows that $\lim \delta(t) = L$, and Theorem 4.2 is proved.

THEOREM 4.3. If $r \geq 0$, then $B_r$ includes $D_r$.

The proposition being evident when $r = 0$, we suppose $r > 0$. Let $\sum u_n$ be
summable $D_r$ to $L$ so that $\lim \delta(t) = L$. Then, using the fact (§1) that $C_r$
includes $D_r$ when $r > 0$, we see that $\sum u_n$ must be summable $C_r$ and hence, as
is well known, that $\lim u_n/n^r = 0$. Hence Theorem 4.1 shows that $\lim \beta(t) = L$
and Theorem 4.3 is proved.

Combining Theorems 4.2 and 4.3 with the fact that $D_r$ and $C_r$ are
equivalent when $r \geq 0$, we obtain


THEOREM 4.4. If $r \geq 0$, then $B_r$, $D_r$, and $C_r$ are equivalent.
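
The inequality used in the proof of Theorem 4.1 can be spot-checked numerically. The sketch below is an illustration with hypothetical helper names: it evaluates $|\delta(t) - \beta(t)|$ from (1.2) and (1.3) and compares it with $|u_{[t]}|/[t]^{r'}$, where $r' = R(r)$ (the modulus of $[t]^r$). The test series $u_k = (-1)^k$ and the order $r = 0.5 + 2i$ are arbitrary choices.

```python
import math

def riesz_B(u, t, r):
    """beta(t) of (1.2): k runs from 0 to [t] - 1."""
    return sum((1 - k / t) ** r * u[k] for k in range(math.floor(t)))

def riesz_D(u, t, r):
    """delta(t) of (1.3): k runs from 0 to the greatest integer < t."""
    top = math.ceil(t) - 1
    return sum((1 - k / t) ** r * u[k] for k in range(top + 1))

def gap_and_bound(u, t, r):
    """Return (|delta(t) - beta(t)|, |u_[t]| / [t]^{R(r)}) for t > 1."""
    m = math.floor(t)
    gap = abs(riesz_D(u, t, r) - riesz_B(u, t, r))
    bound = abs(u[m]) / m ** complex(r).real
    return gap, bound

u = [(-1) ** k for k in range(50)]
r = 0.5 + 2j
```

For integer $t$ the two transforms coincide and the gap is zero; for non-integer $t$ the gap is the modulus of the single term with $k = [t]$, which is at most $[t]^{-r'}|u_{[t]}|$ because $(t - [t])/t < 1/[t]$.
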


5. A relation between the $A_r$ and $B_r$ transforms when $R(r) < 0$. We
proceed to establish some preliminary propositions, interesting in themselves,
which will enable us to obtain relations between $B_r$ and $C_r$.


THEOREM 5.1. When $R(r) < 0$, the assumption that


(5.11) $\lim_{t\to\infty} \beta(t) = \lim_{t\to\infty} \sum_{k=0}^{[t]-1} \left(1 - \frac{k}{t}\right)^r u_k = L$


is equivalent to the two assumptions

(5.12) $\lim_{n\to\infty} u_n/n^r = 0$

and

(5.13) $\lim_{n\to\infty} \alpha_n = \lim_{n\to\infty} \sum_{k=0}^{n-1} \left(1 - \frac{k}{n}\right)^r u_k = L.$*


That (5.11) implies (5.12) follows from Theorem 3.1 with $\theta = 1$; and that
(5.11) implies (5.13) follows from the fact that $\alpha_n = \beta(n)$. Hence our problem
here is to show that (5.12) and (5.13) together imply (5.11).
A consideration of the sequence $v_n$ defined by $v_0 = u_0 - L$ and $v_n = u_n$,
$n > 0$, shows that it is sufficient to prove (5.12) and (5.13) imply (5.11) when
$L = 0$. We suppose therefore that $R(r) < 0$, that (5.12) holds, and that (5.13)
holds with $L = 0$; we will show that (5.11) holds with $L = 0$.
Given $\epsilon > 0$, choose an index $N > 0$ so great that


n—1 k 
(5.14) (1 n=N, 


k=0 


and 
(5.15) | r| ), n=N, 


where and r’=R(r). Next, choose an index P>WN so 
great that 


N-1 k 
(5.16) > #(1- P, 


k=0 


Let »>FP and consider the function 


k=0 
Using (5.14), we see that 
(5.18) $|\beta(n)| < \epsilon/2.$
Differentiating (5.17) we find 


k=0 


where the derivative for =m is a right-hand derivative. Hence 


and using (5.16) and (5.15) we obtain 


* Consideration of independence of (5.12) and (5.13) is relegated to §10 where we study $A_r$.


(5.19) | B'(t) | < «/4 + 
where 
n—1 kr’ t— k r’—1 
(5.20) ame 
k=l 
But since r’ <0, 
{t/2] Rr’ t-— k r’—1 1 {¢/2] k r/—1 1 {t/2] 
kel iv t 


and 


t 


p=1 


km (t/2]+1 k=[t/2]41 


so that 
(5.21) ®,(t) + = B, nst<n+1. 
p=1 


From (5.19) and (5.21) we obtain

(5.22) $|\beta'(t)| < \epsilon/2$, $n \leq t < n + 1$.

Using (5.18), (5.22), and the formula


$|\beta(t)| \leq |\beta(n)| + \int_n^t |\beta'(t)|\,dt,$


we find that $|\beta(t)| < \epsilon$, $n \leq t < n + 1$.

We have shown that if $n > P$, then $|\beta(t)| < \epsilon$, $n \leq t < n + 1$. It follows that
if $t > P + 1$, then $|\beta(t)| < \epsilon$. Hence $\lim \beta(t) = 0$ and Theorem 5.1 is proved.

6. Lemmas involving $C_r$. The Cesàro method $C_r$ of order $r$ ($r$ not a
negative integer) is defined by the transformation


(6.01) $\gamma_n = \sum_{k=0}^{n} a_{nk} u_k$ $(n = 0, 1, 2, \cdots)$,

where

(6.02) $a_{nk} = \frac{\Gamma(n+1)\,\Gamma(n-k+1+r)}{\Gamma(n+1+r)\,\Gamma(n-k+1)}$, $0 \leq k \leq n$.


The following two lemmas will be used in the next section.


LEMMA 6.1. Corresponding to each complex constant $r$ (not a negative integer)
there is a bounded sequence $C_{nk}$ of constants such that for each positive index $n$
and each index $k < n$


(6.11) $a_{nk} = \left(1 - \frac{k}{n}\right)^r \left(1 + \frac{C_{nk}}{n-k}\right).$


Using the familiar asymptotic expansion of the logarithm of the gamma
function of a complex argument,* we find


(6.12) $\log \{\Gamma(n+1+r)/\Gamma(n+1)\} = r \log n + H_n/n$, $n > 0$,


where $H_n$ is a bounded sequence of constants. Subtracting (6.12) from the
relation obtained by replacing $n$ by $n - k$ in it, we obtain


(6.13) $\log a_{nk} = r \log \{(n-k)/n\} - H_n/n + H_{n-k}/(n-k)$

when $n > 0$ and $k < n$. The lemma results from (6.13). The following lemma is
easily deduced from (6.12).

LEMMA 6.2. When $r$ is not a negative integer

(6.21) $\lim_{n\to\infty} n^r a_{nn} = \Gamma(1 + r).$


In §8 we shall need

LEMMA 6.3. When $R(r) < -1$, $r$ not a negative integer, the condition
$\lim u_n/n^r = 0$ is not necessary in order that $\sum u_n$ may be summable $C_r$.


The inverse of $C_r$ is, when $r$ is not an integer, given by

(6.31) $u_n = \sum_{k=0}^{n} (-1)^{n-k} \binom{k+r}{r} \binom{r+1}{n-k} \gamma_k$

or

(6.32) $u_n = \frac{\sin \pi r}{\pi} \sum_{k=0}^{n} \frac{\Gamma(2+r)\,\Gamma(k+1+r)\,\Gamma(n-k-1-r)}{\Gamma(k+1)\,\Gamma(1+r)\,\Gamma(n-k+1)}\, \gamma_k.$


Corresponding to each complex $r$ which is not an integer, let the sequence
$\gamma_k^{(r)}$ be defined by $\gamma_0^{(r)} = \pi/\{\sin \pi r\,\Gamma(2+r)\}$ and $\gamma_k^{(r)} = 0$ when $k > 0$, and
let $\sum u_n^{(r)}$ be the series whose $C_r$ transform is $\gamma_n^{(r)}$. Substituting in (6.32) we
find

(6.33) $u_n^{(r)} = \Gamma(n-1-r)/\Gamma(n+1) = n^{-2-r}(1 + o(1))$


so that

(6.34) $u_n^{(r)}/n^r = n^{-2-2r}(1 + o(1)).$

The series $\sum u_n^{(r)}$ is summable $C_r$ to 0 and, when $R(r) < -1$, the right
member of (6.34) fails to converge to 0 as $n \to \infty$; thus Lemma 6.3 is
established.

* See, for example, J. L. W. V. Jensen (translation by T. H. Gronwall), Annals of Mathematics,
(2), vol. 17 (1916), p. 136.
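
Lemmas 6.1 and 6.2 lend themselves to a quick numerical check. The sketch below is illustrative only and restricted to real orders $r > -1$, so that every gamma argument in (6.02) is positive and `math.lgamma` can be used without sign bookkeeping; the function names are hypothetical. It evaluates $a_{nk}$ from (6.02), solves (6.11) for $C_{nk}$, and forms $n^r a_{nn}$ for comparison with $\Gamma(1+r)$.

```python
import math

def log_a(n, k, r):
    """log a_{nk} from (6.02); assumes real r > -1 and 0 <= k <= n."""
    return (math.lgamma(n + 1) + math.lgamma(n - k + 1 + r)
            - math.lgamma(n + 1 + r) - math.lgamma(n - k + 1))

def a(n, k, r):
    """Cesaro coefficient a_{nk} of (6.02)."""
    return math.exp(log_a(n, k, r))

def C(n, k, r):
    """C_{nk} obtained by solving (6.11) for it; requires k < n."""
    return (n - k) * (a(n, k, r) / (1 - k / n) ** r - 1)

def scaled_diagonal(n, r):
    """n^r a_{nn}, which Lemma 6.2 says tends to Gamma(1 + r)."""
    return n ** r * a(n, n, r)
```

For $r = -1/2$ the values `scaled_diagonal(n, r)` approach $\Gamma(1/2) = \sqrt{\pi}$, and the quantities `C(n, k, r)` remain small in absolute value over all $k < n$, in line with the boundedness asserted in Lemma 6.1.
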

7. A relation between the $A_r$ and $C_r$ transforms when $R(r) < 0$. With the
lemmas of §6 at our disposal, we are in a position to prove the following
theorem.

THEOREM 7.1. If $R(r) < 0$ ($r$ not a negative integer) and the terms of $\sum u_n$
satisfy the condition


(7.11) $\lim_{n\to\infty} u_n/n^r = 0,$


then

(7.12) $\lim_{n\to\infty} (\gamma_n - \alpha_n) = 0,$


where $\gamma_n$ and $\alpha_n$ represent respectively the $C_r$ and $A_r$ transform of $\sum u_n$.*


Letting $\sum u_n$ be any series for which (7.11) holds, we have for each $n > 1$


(7.13) $\gamma_n - \alpha_n = a_{nn}u_n + \sum_{k=0}^{n-1} \left\{a_{nk} - \left(1 - \frac{k}{n}\right)^r\right\} u_k.$


Writing $a_{nn}u_n$ in the form $(n^r a_{nn})(u_n/n^r)$, we see from Lemma 6.2 and (7.11)
that it approaches zero as $n$ becomes infinite. Furthermore the coefficient
of $u_0$ is zero for each $n$. Hence it follows from (7.13) that


and we may use Lemma 6.1 to obtain 
(7.15) 


Choosing a constant $C$ such that $|C_{nk}| < C$ when $0 < k < n$, we obtain


n—1 kr’ k r’—1 


k=1 


| | ’ 


where $r' = R(r)$.

* It should be noted that the hypotheses of this theorem are not sufficient to ensure that either
of the sequences $\gamma_n$ or $\alpha_n$ is convergent, and hence that this theorem gives an especially important
relation between Cesàro and Riesz transforms.

Now (7.16) shows that (7.12) will follow if $\lim v_n = 0$ implies $\lim V_n = 0$
when $V_n$ is defined by
n—1 kr’ n — k)’-1 

Thus we can establish Theorem 7.1 by proving that the transformation 
defined by (7.17) is regular over the set of sequences which converge to zero. 
To prove the latter result, it is necessary as well as sufficient* to prove that 
(7.18) lim — = 0 (k = 1,2,3,---), 


and that 


n—1 
(7.19) <M (n= 2,3,4.---), 
k=1 
for some constant M which may depend on r but must be independent of n. 
It is clear that (7.18) holds for any value of $r$. That (7.19) holds when
$R(r) < 0$ follows from (5.20) and (5.21) since $W_n = \Phi_n(n)$. Thus Theorem 7.1
is proved.
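
Theorem 7.1 can be illustrated numerically for a real order. The following sketch is only an example under stated assumptions: it takes $r = -1/2$ and the series $u_n = (-1)^n/(n+1)$, for which $u_n/n^r \to 0$ so that (7.11) holds, and it uses hypothetical helper names. The Cesàro coefficients come from (6.02) via `math.lgamma`, which is valid here because $r > -1$ keeps all gamma arguments positive.

```python
import math

def cesaro(u, n, r):
    """gamma_n of (6.01) with a_{nk} from (6.02); assumes real r > -1."""
    total = 0.0
    for k in range(n + 1):
        log_a = (math.lgamma(n + 1) + math.lgamma(n - k + 1 + r)
                 - math.lgamma(n + 1 + r) - math.lgamma(n - k + 1))
        total += math.exp(log_a) * u[k]
    return total

def riesz_A(u, n, r):
    """alpha_n of (1.1)."""
    return sum((1 - k / n) ** r * u[k] for k in range(n))

def transform_gap(u, n, r):
    """|gamma_n - alpha_n|, the quantity that (7.12) says tends to zero."""
    return abs(cesaro(u, n, r) - riesz_A(u, n, r))

r = -0.5
u = [(-1) ** k / (k + 1) for k in range(4001)]
```

The gap is dominated by the diagonal term $a_{nn}u_n$, which behaves like $\Gamma(1/2)\,n^{1/2}/(n+1)$ for this series, so it shrinks slowly as $n$ grows.
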
8. Relations between B, and C,. The preceding results enable us to es- 
tablish the following two theorems. 


THEOREM 8.1. If R(r) <0, r not a negative integer, then C, includes B,. 


Suppose >“, is summable B, to L so that lim 6(#) =. Then by Theorem 
5.1, (5.12) and (5.13) hold and we may use Theorem 7.1 to show that lim 
y.=L. Thus Theorem 8.1 is proved. 


THEOREM 8.2. If —1<R(r) <0, then B, includes C,; if R(r) < —1, B, does 
not include C,. 


Suppose $-1 < R(r) < 0$ and $\sum u_n$ is summable $C_r$ to $L$. Then $\lim \gamma_n = L$.
Since, as is well known, (7.11) is a necessary condition for summability $C_r$
when $R(r) > -1$, we can apply Theorem 7.1 to obtain $\lim \alpha_n = L$; an application
of Theorem 5.1 completes the proof of the first part of Theorem 8.2. To
prove the second part suppose $R(r) < -1$, and, of course, that $r$ is not a
negative integer. By Lemma 6.3, there is a series $\sum u_n$ summable $C_r$ for which
(5.12) fails; hence by Theorem 5.1, $\sum u_n$ is not summable $B_r$ and the second
part of Theorem 8.2 is proved.

Theorems 8.1 and 8.2 yield 


* Kojima, Téhoku Mathematical Journal, vol. 12 (1917), pp. 291-326; p. 300. 


542 R. P. AGNEW [April 


THEOREM 8.3. If $-1 < R(r) < 0$, then $B_r$ and $C_r$ are equivalent; if $R(r) < -1$,
$B_r$ and $C_r$ are not equivalent.


THEOREM 8.4. If $r$ is real and $> -1$, then $B_r$ and $C_r$ are equivalent.


When $-1 < r < 0$, this is included in Theorem 8.3. When $r > 0$, the result
is included in Theorem 4.4.

Cesàro's method $C_r$ of summability is, as is well known, not regular when
$R(r) < 0$. When $-1 < R(r) < 0$, $C_r$ will evaluate only a subset of the set of all
convergent series, and will evaluate no divergent series; hence, as might be
expected, $C_r$ occupies, for this range of values of $r$, a prominent place in the
theory of series. On the other hand when $r$ is real and $< -1$, $C_r$ can evaluate
to zero certain divergent series of positive terms (see, for example, §6).
Owing to this fact, and also to the fact that many useful properties which hold
when $R(r) > -1$ fail when $R(r) \le -1$, the method $C_r$ has received little
attention when $R(r) < -1$.

It is of interest to note that Theorems 8.1 and 8.2 show that $B_r$ is equivalent
to $C_r$ over precisely the range of values of $r$ with negative real parts over
which $C_r$ has been useful, namely the range $-1 < R(r) < 0$.

In the next section, we will show that summability $B_r$ is significant even
when $R(r) < -1$.

9. Relations between methods $B_r$ of different orders. In this section we
prove six theorems on relations between methods $B_r$ of different orders.


THEOREM 9.1. If $R(r) < -1$ and $R(r) \le R(s)$, then $B_s$ includes $B_r$.


Let $\sum u_n$ be summable $B_r$ to $L$ so that $\lim \beta(t) = L$. Then, by Theorem
5.1, $\lim u_n/n^r = 0$. We may write


We see that Theorem 9.1 will follow if the transformation 


is regular over the set of all sequences $w_n$ which converge to zero.
Letting $d_k(t)$ represent the coefficient of $w_k$ in (9.11), we have evidently


(9.12) $\lim_{t \to \infty} d_k(t) = 0 \quad (k = 1, 2, 3, \dots)$.




{t—1] r’ 8 {t]-1 r’ 
= 1 k=1 k=1 


k 


where r’=R(r) and s’=R(s). Since r’ < —1, 


[t/2} [¢/2] 
k=l t 


[t]—1 [t]—1 

[t/2]+1 t k= [t/2]+1 kel 
Hence 


]-1 


(9.13) d,(t)| < 2-"'+2 


k=1 


The conditions (9.12) and (9.13) ensure that (9.11) has the desired property 
and Theorem 9.1 is proved. 
From Theorem 9.1, we obtain at once


THEOREM 9.2. If $R(r) = R(s) < -1$, then $B_r$ and $B_s$ are equivalent.


From Theorems 9.1 and 5.1 we obtain


THEOREM 9.3. If $R(r) < -1$ and $\sum u_n$ is summable $B_r$ to $L$, then $\sum u_n$
converges to $L$, the convergence being absolute.


That $\sum u_n$ must converge to $L$ follows from the fact that $B_0$, which includes
$B_r$ by Theorem 9.1, represents convergence. Again, by Theorem 5.1,
$\lim u_n/n^r = 0$; hence $|u_n| < n^{r'}$, $r' = R(r) < -1$, for all sufficiently great $n$, and
absolute convergence of $\sum u_n$ follows. Thus Theorem 9.3 is proved.


THEOREM 9.4. If $-1 < R(r) < R(s) < 0$, then $B_s$ includes $B_r$. If $-1 < r < s$,
then $B_s$ includes $B_r$.


The first part of the Theorem follows from the fact (Theorem 8.3) that
$B_r$ and $C_r$ are equivalent when $-1 < R(r) < 0$ and the fact that $C_s$ includes
$C_r$ when $-1 < R(r) < R(s)$. The second part follows from a similar application
of Theorem 8.4.

To complete Theorems 9.1 and 9.4, it would be desirable to determine
whether $B_s$ includes $B_r$ when $-1 = R(r) < R(s) < 0$. Neither the method of
proof of Theorem 9.1 nor that of Theorem 9.4 throws light on this question.
A partial answer to this question is given by the following theorem.




THEOREM 9.5. If $-1 = r < R(s) < 0$, then $B_s$ includes $B_r$.


We shall give a proof of Theorem 9.5 after having proved Theorem 10.1
below. After having proved Theorem 9.5, we can use Theorems 9.1, 9.4,
and 9.5 to give a relation of inclusion between any two methods $B_r$ of real
orders, namely


THEOREM 9.6. If $r < s$, then $B_s$ includes $B_r$.

10. Consideration of $A_r$. Since it is sometimes convenient to use
transformations involving a continuous parameter, and at other times a
discontinuous parameter, it is important to know whether $A_r$ and $B_r$ are equivalent,
and whether the results which we have established for $B_r$ hold also for $A_r$.

Using Theorem 8.4 and the result of Riesz that $A_r$ and $C_r$ are equivalent
when $-1 < r < 1$, we see that $A_r$ and $B_r$ are equivalent when $-1 < r < 1$. We
proceed to show that $A_r$ and $B_r$ have very different properties when
$R(r) < -1$.

A series $\sum u_n$ is said to be summable by the Abel method $P$ to $L$ if $\sum u_n x^n$
converges for $|x| < 1$ and $\lim_{x \to 1^-} \sum u_n x^n = L$. We shall say that $\sum u_n$ is
summable $P^*$ to $L$ if $\sum u_n x^n$ converges for all sufficiently small $|x|$ and generates
an analytic function $u(x)$ such that $\lim_{x \to 1^-} u(x) = L$.† It is evident that $P^*$
includes $P$ and that $P$ does not include $P^*$.
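
The Abel method $P$ can be tried numerically. The sketch below is only an illustration and is not taken from the paper: it evaluates a truncated power series $\sum u_n x^n$ for $x$ near 1, using Grandi's series $1 - 1 + 1 - \cdots$ as an assumed test case, whose generating function is $1/(1+x)$. The function name `abel_mean` and the truncation length are choices made here.

```python
def abel_mean(terms, x):
    """Evaluate the truncated power series sum(u_n * x**n) for the given terms."""
    total = 0.0
    power = 1.0
    for u in terms:
        total += u * power
        power *= x
    return total


# Grandi's series 1 - 1 + 1 - ... diverges, but sum u_n x^n = 1 / (1 + x) for |x| < 1,
# so the Abel value is the limit of 1 / (1 + x) as x -> 1 from below.
grandi = [(-1) ** n for n in range(50_000)]

for x in (0.9, 0.99, 0.999):
    # The truncation error is of order x**50000, which is negligible here.
    print(x, abel_mean(grandi, x), 1.0 / (1.0 + x))
```

The printed values approach 1/2 as $x$ approaches 1, which is the value the Abel method assigns to this divergent series.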


THEOREM 10.1. If $R(r) \le -1$ and $r \ne -1$, then $P^*$ does not include $A_r$;
if $r$ is real and $\ge -1$, then $P^*$ includes $A_r$.


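Let $\sum u_n$ be summable $A_r$ to $L$; then $\alpha_n \to L$ where
(10.11) $(n+1)^r \alpha_{n+1} = \sum_{k=0}^{n} (n+1-k)^r u_k$.
From (10.11) we obtain when $|x| < 1$,


(10.12) $\sum_{n=0}^{\infty} (n+1)^r \alpha_{n+1} x^n = \sum_{n=0}^{\infty} x^n \sum_{k=0}^{n} (n+1-k)^r u_k$.


Letting $u(x)$ be the analytic function determined by the equation


(10.13) $\sum_{n=0}^{\infty} (n+1)^r \alpha_{n+1} x^n = u(x) \sum_{n=0}^{\infty} (n+1)^r x^n$,


we see that when $|x|$ is sufficiently small, say $|x| < \delta$, $u(x)$ is a convergent
power series in $x$. A comparison of (10.12) and (10.13) suffices to show that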


† This modification of Abel's method was introduced by Silverman-Tamarkin, Mathematische
Zeitschrift, vol. 29 (1928), pp. 161-170; p. 169.




(10.14) $u(x) = \sum_{k=0}^{\infty} u_k x^k, \quad |x| < \delta$;

hence $u(x)$ is the analytic function generated by $\sum u_n x^n$. That $P^*$ includes $A_r$
when $r$ is real and $\ge -1$ follows at once from the conditions for regularity†
of the transformation defined by (10.13); this also follows from a result of
Silverman-Tamarkin, loc. cit.

We shall prove the first part of Theorem 10.1 by a method which shows
that $P^*$ and $A_r$ are inconsistent when $R(r) < -1$. Corresponding to each
complex $r$, let $\sum u_n^{(r)}$ be the series having for its $A_r$ transform the sequence
$\alpha_1 = 1$, $\alpha_n = 0$ when $n > 1$. Then $\sum u_n^{(r)}$ is summable $A_r$ to 0. Using (10.14) and
(10.13), we see that the analytic function $u(x)$ generated by $\sum u_n^{(r)} x^n$ is
given by


(10.15) $u(x) \sum_{n=0}^{\infty} (n+1)^r x^n = 1$.


But when $r = -1 - ih$, $h$ real and $\ne 0$, we have as $x \to 1^-$‡


+ {log (1/x) } * 
n=0 


so that $\lim_{x \to 1^-} u(x)$ does not exist and $\sum u_n^{(r)}$ is non-summable $P^*$. On the
other hand, if $R(r) < -1$, then $\sum (n+1)^r$ converges to $\zeta(-r)$ which is finite
and different from zero; hence $\sum u_n^{(r)}$ is summable $P^*$ to $1/\zeta(-r)$ which is
finite and different from the $A_r$ value of $\sum u_n^{(r)}$. Thus Theorem 10.1 is proved.
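
The counterexample in this proof can be checked numerically for one value of $r$. The sketch below rests on an assumption about the garbled displays: it reads the $A_r$ transform as $\alpha_n = \sum_{k=0}^{n-1} (1 - k/n)^r u_k$ and equation (10.15) as $u(x) \sum_{n \ge 0} (n+1)^r x^n = 1$. Under that reading the terms $u_n$ are the coefficients of a reciprocal power series. The choice $r = -2$, the function names, and the truncation length are illustrative only.

```python
import math


def series_with_unit_transform(r, count):
    """Coefficients u_0..u_{count-1} of u(x), where u(x) * sum((n+1)**r * x**n) = 1."""
    c = [(n + 1) ** r for n in range(count)]  # c[0] == 1
    u = [1.0]
    for n in range(1, count):
        # The coefficient of x**n in the product must vanish for n >= 1.
        u.append(-sum(u[k] * c[n - k] for k in range(n)))
    return u


def a_transform(u, r, n):
    """alpha_n = sum_{k=0}^{n-1} (1 - k/n)**r * u_k (assumed reading of the transform)."""
    return sum((1 - k / n) ** r * u[k] for k in range(n))


r = -2
u = series_with_unit_transform(r, 1500)

# The A_r transform is 1, 0, 0, ... so the A_r value of the series is 0.
print([round(a_transform(u, r, n), 12) for n in (1, 2, 3, 10)])

# The ordinary partial sums approach 1 / zeta(2) = 6 / pi**2 instead.
print(sum(u), 6 / math.pi ** 2)
```

For $r = -2$ the partial sums settle near $6/\pi^2$, which differs from the $A_r$ value 0, in line with the inconsistency the proof describes.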

We pass now to a proof of Theorem 9.5. Let $\sum u_n$ be summable $B_{-1}$ to $L$.
Then by Theorem 5.1, $n u_n \to 0$ and $\sum u_n$ is summable $A_{-1}$ to $L$. Then by
Theorem 10.1, $\sum u_n$ is summable $P^*$ to $L$. But $\sum u_n x^n$ must converge when
$|x| < 1$ since $n u_n \to 0$; hence $\sum u_n$ is summable $P$ to $L$. Therefore, by Tauber's
Theorem§ $\sum u_n$ must converge to $L$. Since $n u_n \to 0$ and $\sum u_n$ converges to $L$,
it follows|| that $\sum u_n$ is summable $C_s$ for every $s$ with $R(s) > -1$. Finally
summability $B_s$ for every $s$ with $-1 < R(s) < 0$ follows from Theorem 8.3 and
Theorem 9.5 is proved.

We have shown in the proof of Theorem 10.1 that when $R(r) < -1$, the
transformation $A_r$ can evaluate to 0 a series which is not summable $P$ to 0 and
which is therefore not convergent to 0. Using this result and Theorem 9.3, we
obtain


† W. A. Hurwitz, loc. cit., p. 20.

‡ Lindelöf, Le Calcul des Résidus, p. 139.

§ A. Tauber, Monatshefte für Mathematik und Physik, vol. 8 (1897), pp. 273-277.

|| Hardy and Littlewood, Proceedings of the London Mathematical Society, (2), vol. 11 (1912),
p. 462.




THEOREM 10.2. If $R(r) < -1$, then $A_r$ and $B_r$ are not equivalent.


Theorem 10.1 also shows that the methods $A_r$ do not, in contrast to the
methods $B_r$, form for real values of $r$ a set of consistent methods of
summability whose effectiveness increases steadily as $r$ increases.

We can now see that (5.12) is not a consequence of (5.13) when $R(r) < -1$
by proving

THEOREM 10.3. When $R(r) < -1$, the condition $u_n/n^r \to 0$ is not necessary in
order that $\sum u_n$ may be summable $A_r$.

If the condition were necessary, it would follow from Theorem 5.1 that
$A_r$ and $B_r$ would be equivalent and Theorem 10.2 would be contradicted.

In the next section, we give a theorem which is interesting in connection
with Theorem 10.3, and give further properties of $A_r$.

11. Consideration of $A_r$ when $R(r) < \xi$. Let $\xi$, $-2 < \xi < -1$, be the real
negative root of the equation


(11.01) 
We shall now prove 


THEOREM 11.1. If $R(r) < \xi$ and $\sum u_n$ is bounded $A_r$,† then $u_n/n^r$ is bounded
for all $n$.


Let $\sum u_n$ be bounded $A_r$, $R(r) < \xi$, so that $\alpha_n$, being defined by


(11.11) a, = (1 
n 


k=0 


is a bounded sequence. Since r’=R(r) <f, it follows that 
(11.12) 
Choose an index p>1 so great that 
(11.13) Ske’ < (1 — 0,)/2 

k=p 


and let a sequence v, be defined by the formulas v,=0, +m 
+--+ -+u,; and 7,=u,,n>p. Then 


n—1 k r n—1 k r 
a, = 0(1)+ >> (1 =) = o(1) + ~) (u./k*). 
n n 


k=p k=p 


Hence we can prove Theorem 11.1 by showing that boundedness of $W_n$


† $\sum u_n$ is said to be bounded $A_r$ when its $A_r$ transform is a bounded sequence.


implies boundedness of $w_n$ whenever
(11.14) y 
= Wk, 
n+1) 


Let $d_{nk}$ represent the coefficient of $w_k$ in (11.14). Then when $n > 2p$,


n—p—1 
(11.15) | < (1 — 8,)/2; 
k= p k=p 


the first inequality being obtained by considering separately the sums when
$k$ ranges from $p$ to $[(n+1)/2]$ and from $[(n+1)/2]+1$ to $n-p-1$. Also


n—1 P 
(11.16) lim > = 


k=n—p k=2 


Combining (11.15) and (11.16), we obtain 


n—1 
(11.17) limsup >>| S (1+ 6,)/2 <1. 


no k=p 


Since 
(11.18) lim dain = 1, 


we may use (11.17) and the fact that $d_{nk} = 0$ when $k < p$ to obtain
n—1 
(11.19) lim inf | >0. 
k=0 


Owing to (11.19), the fact that the transformation (11.14) has the desired 
property results from the following lemma. 


LEMMA 11.2. If the coefficients in the transformation
(11.21) $W_n = \sum_{k=0}^{n} d_{nk} w_k$
satisfy (11.19) and if $W_n$ is a bounded sequence, then $w_n$ is a bounded sequence.


To prove this lemma, let $w_n$ be an unbounded sequence; we shall show
that $W_n$ is an unbounded sequence. Since $w_n$ is unbounded, we can choose
an increasing sequence $n_j$ of indices such that $|w_{n_j}| \ge |w_k|$ when $0 \le k < n_j$.
Then


$|W_{n_j}| \ge |d_{n_j n_j}|\,|w_{n_j}| - \sum_{k=0}^{n_j-1} |d_{n_j k}|\,|w_k| \ge |w_{n_j}| \Big( |d_{n_j n_j}| - \sum_{k=0}^{n_j-1} |d_{n_j k}| \Big)$.


But $\lim |w_{n_j}| = +\infty$ and using (11.19) we see that $\lim |W_{n_j}| = +\infty$. Hence
$W_n$ is an unbounded sequence, Lemma 11.2 is proved, and Theorem 11.1
follows.


THEOREM 11.2. If $R(r) < \xi$, every series summable $A_r$ is convergent, but not
necessarily to the value to which it is summable.


That a series summable $A_r$ must be convergent follows from Theorem
11.1; in fact boundedness $A_r$ is sufficient to ensure absolute convergence of
$\sum u_n$. That the $A_r$ and convergence values need not be equal is shown by the
series $\sum u_n^{(r)}$ used in the proof of Theorem 10.1.

Since $C_r$ can evaluate certain divergent series when $R(r) < -1$, $r$ not a
negative integer, and $A_r$ can evaluate only absolutely convergent series when
$R(r) < \xi$, it follows that $C_r$ and $A_r$ are not equivalent when $R(r) < \xi$.

The methods $A_r$, $R(r) < \xi$, may be of use for classification of convergent
series; but use of such methods for evaluation of series is open to the
objection that they are, by Theorem 11.2, inconsistent with convergence.
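
The numerical value of the real root introduced at the start of this section can be estimated. The display (11.01) is not legible in this copy, so the sketch below assumes it is equivalent to $\zeta(-x) = 2$, that is $\sum_{k \ge 2} k^{x} = 1$. This reading is inferred from (11.13) and (11.17), where the sums of $k^{r'}$ from $k = 2$ must stay below 1, and it should be treated as an assumption rather than as the author's equation. The helper names are hypothetical.

```python
def zeta(s, n=50):
    """Riemann zeta for real s > 1: partial sum plus an Euler-Maclaurin tail estimate."""
    head = sum(k ** -s for k in range(1, n))
    tail = n ** (1 - s) / (s - 1) + 0.5 * n ** -s + s * n ** (-s - 1) / 12
    return head + tail


def solve_zeta_equals(target=2.0, lo=1.5, hi=2.0, iterations=60):
    """Bisection for zeta(s) = target; zeta is decreasing on (1, infinity)."""
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if zeta(mid) > target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


s_star = solve_zeta_equals()
# Under the stated assumption the root of Section 11 is x = -s_star.
print(-s_star)
```

Under this assumption the root lies near $-1.7286$, inside the interval $(-2, -1)$ stated in the text.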

12. Conclusion. In conclusion we point out that while $A_r$, $B_r$, and $C_r$ are
not mutually equivalent when $R(r) < -1$, there is an important class of
series over which these methods are equivalent. In fact, a combination of
Theorems 5.1 and 7.1 yields the following theorem.


THEOREM 12.1. If $R(r) < 0$, $r$ being $\ne -1, -2, \dots$ when $C_r$ is involved, and
$\lim u_n/n^r = 0$, and if $\sum u_n$ is summable by one of the methods $A_r$, $B_r$, and $C_r$, then
it is summable to the same value by the other two methods.


BROWN UNIVERSITY,
PROVIDENCE, R. I.


ON SOME FUNCTIONALS* 


BY 
STANISLAW SAKS†


1. In this paper we intend to give a new proof and various generalizations
of the following theorem due to Hahn:‡

If $\{f_n(t)\}$ is a sequence of summable functions in the interval $I = (0, 1)$ and
if $\lim_n \int_E f_n(t)\,dt$ exists for every measurable set $E \subset I$, then the indefinite integrals
$F_n(x) = \int_0^x f_n(t)\,dt$ are equally absolutely continuous in $I$ and therefore converge
to an absolutely continuous function.

The proof will be based on a theorem of Baire which has proved useful in
many similar cases.§ Incidentally there will be given a generalization of
another theorem concerning sequences of functional transformations and
published in a previous paper by the author.¶

2. We shall denote by $R$ the space of measurable characteristic functions
in the interval $I = (0, 1)$, i.e., functions which almost everywhere assume two
values only, 0 and 1. The distance of two functions $x(t)$, $y(t)$ in $R$ is defined
by the formula


$d(x, y) = \|x - y\| = \int_0^1 |x(t) - y(t)|\,dt, \qquad d(x, 0) = \|x\| = \int_0^1 x(t)\,dt.$


With this definition of the distance, $R$ is a metric complete space. It is not
linear but nevertheless has simple properties which in some cases may replace
linearity. We shall state them in the following lemma.

LEMMA. (i) If $u$ and $x_0$ belong to $R$ and if $\|u\| < r$, then there exist in $R$
elements $u_1$, $u_2$ such that


$u_1 = u_2 + u, \quad d(x_0, u_1) \le r, \quad d(x_0, u_2) \le r.$
(ii) If $x_1 \in R$, $x_2 \in R$, $d(x_1, x_2) \le r$, then there exist in $R$ elements $u_1$, $u_2$ such that
$\|u_1\| \le r, \quad \|u_2\| \le r, \quad x_1 - u_1 \in R, \quad x_1 - u_1 = x_2 - u_2.$
(iii) If $x$ is an element of $R$ and $r > 0$, then there exist a finite number of
elements $u_1, u_2, \dots, u_n$ such that $\|u_i\| < r$ $(i = 1, 2, \dots, n)$ and

$x = u_1 + u_2 + \cdots + u_n.$

* Presented to the Society, October 29, 1932; received by the editors August 24, 1932. 

† International Research Fellow.

‡ Hahn [1]. Another proof is given in the book of Banach [2, pp. 152-158]. References in
brackets refer to the bibliography at the end of this paper.

§ See for instance Banach and Steinhaus [1], Banach [2].

¶ Saks [1].




To prove (i) we simply put
$u_1(t) = u(t) + x_0(t)[1 - u(t)],$
$u_2(t) = x_0(t)[1 - u(t)].$

To prove (ii) we put

$u_1(t) = x_1(t)[1 - x_2(t)],$
$u_2(t) = x_2(t)[1 - x_1(t)].$

To prove (iii) let $n$ be an arbitrary integer $> r^{-1}$. Let $v_i$ $(i = 1, 2, \dots, n)$
denote the characteristic function of the interval $((i-1)/n, i/n)$. The
functions $u_i(t) = x(t)v_i(t)$ satisfy the condition required by (iii).
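
The three constructions can be tried on step functions. The sketch below is an illustration under stated assumptions, not part of the paper: a characteristic function on $(0, 1)$ is modelled as a list of 0/1 values on equal cells, the distance is the fraction of cells where two lists differ, and the formulas for $u_1$, $u_2$ are the ones read from the proofs of (i), (ii), and (iii). All function names are hypothetical.

```python
import random


def dist(x, y):
    """d(x, y): measure of the set where two 0/1 step functions differ."""
    return sum(abs(a - b) for a, b in zip(x, y)) / len(x)


def norm(x):
    """||x|| = d(x, 0)."""
    return dist(x, [0] * len(x))


def split_i(x0, u):
    """Part (i): u1 = u + x0 (1 - u), u2 = x0 (1 - u), so that u1 = u2 + u."""
    u1 = [b + a * (1 - b) for a, b in zip(x0, u)]
    u2 = [a * (1 - b) for a, b in zip(x0, u)]
    return u1, u2


def split_ii(x1, x2):
    """Part (ii): u1 = x1 (1 - x2), u2 = x2 (1 - x1), so that x1 - u1 = x2 - u2."""
    u1 = [a * (1 - b) for a, b in zip(x1, x2)]
    u2 = [b * (1 - a) for a, b in zip(x1, x2)]
    return u1, u2


def split_iii(x, n):
    """Part (iii): the pieces x * v_i, v_i the indicator of the i-th of n equal intervals."""
    step = len(x) // n
    return [
        [value if i * step <= j < (i + 1) * step else 0 for j, value in enumerate(x)]
        for i in range(n)
    ]


random.seed(0)
size = 60
x0 = [random.randint(0, 1) for _ in range(size)]
u = [random.randint(0, 1) for _ in range(size)]

u1, u2 = split_i(x0, u)
print(dist(x0, u1) <= norm(u), dist(x0, u2) <= norm(u))

pieces = split_iii(x0, 6)
print(max(norm(piece) for piece in pieces) <= 1 / 6)
```

In this model $u_1$ is the union of the two sets and $u_2$ is the part of $x_0$ outside $u$, which is why both stay within distance $\|u\|$ of $x_0$.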

The theorem of Hahn will now be stated as follows (in a slightly more gen- 
eral form): 


THEOREM 1. If $\{f_n(t)\}$ is a sequence of integrable functions and if the
sequence of functionals


$F_n(x) = \int_0^1 f_n(t)\,x(t)\,dt, \qquad x \in R,$


defined in the space $R$, converges on a set of the second category* $H \subset R$, then the
functionals $F_n(x)$ are equally continuous† in $R$.


We can suppose that the sequence of the functionals $F_n(x)$ converges to
zero for every $x \in H$; otherwise we could replace the sequence $\{F_n(x)\}$ by the
double sequence $\{F_m(x) - F_n(x)\}$. Let $\epsilon$ be an arbitrary positive number.
Denote by $H_n$ the set of points $x \in R$ such that $|F_m(x)| \le \epsilon/2$ for every $m \ge n$.
Then


$H \subset \sum_n H_n.$


As $H$ is by assumption of the second category there exists a value $n_0$ such that
$H_{n_0}$ is also of the second category. On the other hand, by the continuity of the
functionals $F_n(x)$, all sets $H_n$ are closed; hence $H_{n_0}$ contains a sphere, say‡
$K = K(x_0; r)$. Let $u$ be an arbitrary element of $R$ such that $\|u\| < r$. Then, by

* In the sense of Baire. See for instance Hausdorff, Mengenlehre, 1927, pp. 138-145.

† I.e., to every $\epsilon > 0$ there corresponds an $\eta > 0$ such that $\|x\| \le \eta$, $x \in R$, implies $|F_n(x)| \le \epsilon$
$(n = 1, 2, \dots)$. From the equal continuity of the functionals $F_n(x)$, it readily follows that under
the assumptions of Theorem 1 the sequence $\{F_n(x)\}$ converges everywhere in $R$. For a more general
result see below, Theorem 4 (i).

‡ In metric spaces, $K(x_0; r)$ will generally denote the sphere whose center is $x_0$ and radius is $r$.




the preceding lemma, (i), there exist elements $u_1$, $u_2$ in $R$ such that $u_1 = u_2 + u$,
$u_1 \in K$, $u_2 \in K$. Therefore for every $n \ge n_0$,


$|F_n(u)| \le |F_n(u_1)| + |F_n(u_2)| \le \epsilon.$


Thus the theorem of Hahn is established.
By the same method the following theorem may be proved:


THEOREM 2. If $\{f_n(t)\}$ is a sequence of functions integrable over the interval
$I = (0, 1)$ and if for all functions $x(t)$ of a set of the second category in the space $R$


then also 


Tim 


3. In the sequel we shall consider functional transformations defined
either in an arbitrary metric complete linear space or in the space $R$ of
characteristic functions (§2). The values assumed by these transformations will
belong to the space $S$ of all measurable functions defined on a measurable
set $I$. The distance $d(\xi, \eta)$ of two functions $\xi(t)$, $\eta(t) \in S$ will be defined by the
well known formula of Fréchet


$d(\xi, \eta) = \int_I \frac{|\xi(t) - \eta(t)|}{1 + |\xi(t) - \eta(t)|}\,dt;$


and we put, as usual, $\|\xi\| = d(\xi, 0)$. With this definition of the distance,
$\lim_n d(\xi_n, \xi) = 0$ ($\xi_n$, $\xi \in S$) means that the sequence $\{\xi_n(t)\}$ converges in
measure to $\xi(t)$.
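
The link between this distance and convergence in measure can be seen on a standard example that is not in the paper: the "typewriter" sequence of indicator functions of dyadic intervals, which converges to zero in measure although it converges at no point of $[0, 1)$. The sketch below assumes $I = (0, 1)$ and approximates the Fréchet distance with a midpoint rule; the function names are hypothetical.

```python
def frechet_distance(f, g, steps=4096):
    """Midpoint-rule estimate of the integral over (0, 1) of |f - g| / (1 + |f - g|)."""
    h = 1.0 / steps
    total = 0.0
    for i in range(steps):
        t = (i + 0.5) * h
        diff = abs(f(t) - g(t))
        total += diff / (1.0 + diff)
    return total * h


def typewriter(n):
    """Indicator of [j / 2**k, (j + 1) / 2**k), where n = 2**k + j and 0 <= j < 2**k."""
    k = n.bit_length() - 1
    j = n - (1 << k)
    lo, hi = j / 2 ** k, (j + 1) / 2 ** k
    return lambda t: 1.0 if lo <= t < hi else 0.0


def zero(t):
    return 0.0


for n in (1, 2, 8, 100, 1000):
    # The distance to zero is half the length of the dyadic interval, so it shrinks with n.
    print(n, frechet_distance(typewriter(n), zero))
```

The distances tend to zero even though every point of $[0, 1)$ is covered by infinitely many of the intervals, which is the behaviour that separates convergence in measure from pointwise convergence.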

Let


(3.1) $\xi = \xi(x, t) = F(x)$

be a functional transformation of the kind described above ($\xi$ is a measurable
function, $x$ an element of a metric space* $E$, $t \in I$). Since $E$ and $S$ are metric
spaces it is clear what should be understood by the continuity of the
transformation (3.1). This transformation will be called linear if it is continuous
and if, for every pair of elements $x$, $y$ in $E$,


$F(x + y) = F(x) + F(y)$
provided that the sum $x + y$ is defined and belongs to $E$.†


* For the sake of convenience we shall use the Greek letters $\xi$, $\eta$, $\dots$ to denote the elements of
the space $S$, i.e., the measurable functions, and italics, $x$, $y$, $\dots$, for the elements of the space where
the transformations (3.1) are defined; $t$ will usually denote points of the set $I$.

† This restriction is necessary in the case of the space $R$ which is not linear.




Functional transformations
(3.2) $\xi_n = \xi_n(x, t) = F_n(x)$
($\xi_n \in S$, $x \in E$, $t \in I$) will be called equally continuous in the metric space $E$


if to every $\epsilon > 0$ there corresponds an $\eta > 0$ such that whenever $x', x'' \in E$
and $d(x', x'') < \eta$, we have
$d(\xi_n', \xi_n'') \le \epsilon$, $n = 1, 2, \dots$, where $\xi_n' = F_n(x')$, $\xi_n'' = F_n(x'')$;
or, what is equivalent, if to every $\epsilon > 0$ there corresponds an $\eta > 0$ such that
$d(x', x'') < \eta$ implies $|\xi_n(x'', t) - \xi_n(x', t)| \le \epsilon$ for all $t \in I$, with the exception
at most of a subset* of $I$ of measure less than $\epsilon$.
If $A$ is a measurable subset of $I$ then the number


$d_A(\xi, \eta) = \int_A \frac{|\xi(t) - \eta(t)|}{1 + |\xi(t) - \eta(t)|}\,dt = \|\xi - \eta\|_A$

will be called the distance of $\xi$ and $\eta$ with respect to the set $A$. With this
definition it is clear what is meant by the continuity of a transformation (3.1), or
the equal continuity and the convergence of a sequence of transformations
(3.2), with respect to a measurable set $A \subset I$.


4. LEMMA. If
(4.1) $\xi_n = \xi_n(x, t) = F_n(x) \quad (n = 1, 2, \dots)$


($x \in E$, $t \in I$, $\xi_n \in S$) is a sequence of continuous functional transformations in a
metric complete space $E$, and if for every element $x$ of a set of the second category
$H \subset E$, the inequality


$\overline{\lim}_n |\xi_n(x, t)| < \infty$


holds on a set of values of $t$ of measure $a > 0$, then to every $\epsilon$, $0 < \epsilon < a$, there
correspond a set $A \subset I$ (independent of $x$) of measure $a - \epsilon$, a sphere $K$ in $E$, and a
number $M$, such that


$|\xi_n(x, t)| \le M \quad (n = 1, 2, \dots)$
for every $x \in K$ and all $t \in A$ with the exception at most of a set of values of $t$ of
measure zero (which might depend on $x$).


Let $\mathfrak{A}$ be a sequence of measurable subsets of the set $I$ everywhere dense†
in the space of all measurable subsets of $I$. Let $\{A_m\}$ be a sequence of all sets

* This exceptional subset generally depends on $n$.
† I.e., the set of the characteristic functions of sets of $\mathfrak{A}$ has to be everywhere dense in the space
of all measurable characteristic functions defined over $I$ (see §2).




in $\mathfrak{A}$ whose measure is $\ge a - \epsilon$. Denote by $H_m$ the set of elements $x \in E$ such
that


$|\xi_n(x, t)| \le m \quad (n = 1, 2, \dots)$
almost everywhere in $A_m$. Then
$H \subset \sum_m H_m.$


$H$ is by assumption of the second category. Hence, there exists a number
$M$ such that $H_M$ is also of the second category. On the other hand, since the
transformations (4.1) are continuous the sets $H_m$ are closed, and therefore
$H_M$ contains a sphere $K$. It is easily seen that the set $A_M$, the sphere $K$, and
the number $M$ satisfy the required conditions.

THEOREM 3.* If $\{\xi_n(x, t) = F_n(x)\}$ ($x \in E$, $\xi_n \in S$, $t \in I$) is a sequence of
linear transformations in a metric complete and linear space $E$, then there
exists a set $A \subset I$ such that


(i) $\overline{\lim}_n |\xi_n(x, t)| < \infty$


at almost every point $t \in A$ and for every $x \in E$;


(ii) $\overline{\lim}_n |\xi_n(x, t)| = \infty$


at almost every point $t \in I - A$, and for every $x \in E$ with the exception at most of a
set of the first category in $E$;

(iii) the transformations $F_n(x)$ are equally continuous with respect to the set $A$.†

Let $a_0$ be the upper bound of all numbers $a$ such that there exists a set
$H(a)$ of the second category in $E$ with the property that for every $x \in H(a)$


(4.2) $\overline{\lim}_n |\xi_n(x, t)| < \infty$


in a subset of $I$ of measure $a$.‡ Theorem 3 is trivial when $a_0 = 0$. Hence we
may assume $a_0 > 0$. Then, by the preceding lemma, there exists for every $p$
a sphere $K_p$, a number $M_p$ and a set $A_p \subset I$ of measure $\ge a_0 - 1/p$ such that
for every $x \in K_p$


(4.3) $|\xi_n(x, t)| \le M_p \quad (n = 1, 2, \dots)$


* This theorem has been proved in our previous paper (Saks [1]) under the assumption that
there exist an everywhere dense set $E_1$ in $E$ such that the sequence $\{F_n(x)\}$ converges for every
$x \in E_1$.

† $A$ may be empty or coincide with the whole set $I$.

‡ This subset in general depends on $x$.




almost everywhere on $A_p$. Let
(4.4) $A = \sum_p A_p.$


We shall prove that the set $A$ has properties (i), (ii), (iii) above.

First, the inequality (4.2) holds for almost every $t \in A_p$ and every
$x \in K_p$, and so, by the linearity of the space $E$, for almost every $t \in A_p$ and
every $x \in E$,


$\overline{\lim}_n |\xi_n(x, t)| < \infty.$


Finally, by (4.4) this holds for almost every $t \in A$ and every $x \in E$. Hence,
property (i) is established.

In order to prove (ii) suppose that there exists a set $H$ of the second
category in $E$ such that for every $x \in H$ the relation (4.2) holds in a subset of
$I - A$ of positive measure. Then for every $x \in H$ this relation would hold in a
subset of $I$ of measure $>$ meas $A = a_0$, which contradicts the definition of the
number $a_0$.

Finally, let $r_p$ be the radius of the sphere $K_p$, and let $x_0$ be an arbitrary
element in $E$ such that $\|x_0\| < r_p/(pM_p)$. It easily follows from (4.3) that


$|\xi_n(pM_p x_0, t)| \le 2M_p \quad (n = 1, 2, \dots),$


and therefore


$|\xi_n(x_0, t)| \le 2/p$


for almost all $t \in A_p$, i.e. everywhere in $A$ with the exception at most of a set
of measure $\le$ meas $(A - A_p) \le 1/p$. Since $p$ is an arbitrary positive integer,
this proves property (iii).

5. The theorem of the preceding section, with the exception of property
(iii), may be easily extended to the operations considered in the space $R$ (§2).
The proof remains essentially the same, only instead of the linearity of the
space $E$ we should use the properties of the space $R$ as stated in the lemma
of §2. Property (iii) obviously fails for the space $R$. However we have the
following theorem:


THEOREM 4. If $\{\xi_n(x, t) = F_n(x)\}$ ($\xi_n \in S$, $t \in I$) is a sequence of linear
transformations defined either in a linear metric complete space $E$ or in the space $R$,
then there exists a set $B \subset I$ with the following properties:

(i) The functions $F_n(x)$ are equally continuous and the sequence $\{F_n(x)\}$
converges with respect to $B$ for every $x$ in $E$ (or in $R$).




(ii) The sequence $\{F_n(x)\}$ diverges with respect to any subset of $I - B$ of
positive measure for every $x$ in $E$ (or in $R$) with the exception at most of a set of
the first category in $E$ (or in $R$).*


We shall prove this theorem for the space $R$, the proof for the linear spaces
being even simpler. We shall use the following lemma which is itself a
generalization of the theorem of Hahn of §2.


Lemma. If 
= 


(where $C$ is a measurable set) is a sequence of linear transformations in the space $R$
converging with respect to $C$ on a set of the second category $H \subset R$, then $F_n(x)$ are
equally continuous with respect to $C$ (and therefore $\lim_n F_n(x)$ is also a linear
transformation in $R$ with respect to $C$).


The proof runs exactly as that of Hahn's theorem. We can suppose again
that the sequence $\{\xi_n(x, t) = F_n(x)\}$ converges in $H$ to zero. Then, for every
$\epsilon > 0$, there exist a number $n_0$ and a sphere $K_0$ in $R$, of radius $r$, such that


for every $x \in K_0$ and $n \ge n_0$. From (i) and (ii) of the lemma of §2 it readily
follows that $d(x_1, x_2) \le r$ implies $\|F_n(x_2) - F_n(x_1)\|_C < \epsilon$ for every two points
$x_1$, $x_2$ in $R$. This proves our lemma.

We now proceed to the proof of Theorem 4 (for the space $R$). Denote by
$a_0$ the upper bound of numbers $a \ge 0$ with the property that there exists a set
$B(a) \subset I$ of measure $a$ such that the given sequence $\{F_n(x)\}$ converges with
respect to $B(a)$ for all $x \in R$. Let $B$ be the sum of all sets $B(a)$, $a \le a_0$. The set
$B$ has both properties (i) and (ii). Indeed, property (i) follows immediately
from the definition of $B$ and from the preceding lemma. It remains to
establish property (ii). Suppose that for all $x$ of a set of the second category
in $R$, $\{F_n(x)\}$ converges with respect to a subset of $I - B$ of positive measure;
this subset depends generally on $x$, but, by the same argument as in the proof
of the lemma of §4 we can determine a fixed set $C \subset I - B$ of positive measure
such that $\{F_n(x)\}$ converges with respect to $C$ for all $x$ of a set of the second
category $H \subset R$. There exists a sphere $K_0$ in which $H$ is everywhere dense. By
the preceding lemma the transformations $F_n(x)$ are equally continuous with


* Hence, to any sequence of transformations $\{F_n(x)\}$ in a linear space $E$ there correspond two
sets $A$ and $B$ (as defined by Theorems 3 and 4). We have obviously $B \subset A$ (with the exception at
most of a set of measure zero). In a special case, if there exists an everywhere dense set $E_1$ in $E$ such
that the sequence $\{F_n(x)\}$ converges for every $x \in E_1$ (at least with respect to the set $B$), it follows
easily from the property (iii) of Theorem 3 that $B = A$ (Saks [1], Theorem 6). This obviously is
not true for the space $R$.




respect to $C$. Hence, the sequence $\{F_n(x)\}$ converges everywhere in $K_0$. Then,
from parts (ii) and (iii) of the lemma of §2, it easily follows that this sequence
converges (with respect to $C$) everywhere in $R$. Thus it converges everywhere
in $R$ with respect to the set $B + C \subset I$. Since, by assumption, meas $C > 0$ and
therefore meas $(B + C) >$ meas $B = a_0$, this contradicts the definition of $a_0$.
Hence our theorem is proved completely.

6. Conclusion. The theorems of §§4 and 5 may be applied to sequences of
transformations of the form $F_n(x) = \int_0^1 K_n(s, t)\,x(s)\,ds = \xi_n(x, t)$. For instance*
if $\{K_n(s, t)\}$ is a sequence of summable functions (or, more generally,
summable with respect to $s$ for almost every value of $t$) and if the sequence


$\xi_n(x, t) = F_n(x) = \int_0^1 K_n(s, t)\,x(s)\,ds$


converges for all measurable characteristic functions $x(s)$ (or merely on a set
of the second category in $R$) then the operations $F_n(x)$ are equally
continuous.† Therefore, if a sequence of integrals $\int_P K_n(s, t)\,ds$ converges in
measure in $(0, 1)$ for every measurable set $P$ and if there exists a function
$f(t)$ such that $\int_0^a K_n(s, t)\,ds$ converges in measure to $f(t)$ for all $a$, $0 < a < 1$,
then $\int_P K_n(s, t)\,ds$ converges in measure to $f(t)$ for all measurable sets $P$ in the
interval $(0, 1)$.


REFERENCES 


Banach. 1. Sur la convergence presque partout de fonctionnelles linéaires, Bulletin des Sciences
Mathématiques, vol. 50 (1926), pp. 27-32, 36-43.

Banach. 2. Teorja operacyj linjowych (in Polish), Warszawa, 1932.

Banach et Steinhaus. 1. Sur le principe de la condensation des singularités, Fundamenta
Mathematicae, vol. 9 (1927), pp. 50-61.

Hahn. 1. Über Folgen linearer Operationen, Monatshefte für Mathematik und Physik, vol. 32
(1922), pp. 1-88.

Saks. 1. Sur les fonctionnelles de M. Banach et leur application aux développements des fonctions,
Fundamenta Mathematicae, vol. 10 (1927), pp. 186-196.


* Cf. analogous examples in Banach [1].
† If $K_n(s, t)$ reduces to a function of one variable $s$ we find again the theorem of Hahn.


BROWN UNIVERSITY,
PROVIDENCE, R. I.


BOOLEAN ALGEBRA. A CORRECTION 


BY 
EDWARD V. HUNTINGTON 


In my paper in these Transactions for January, 1933, the Example 4.5 
on page 286 is erroneous, and Postulate 4.5 on page 280 is in fact redundant. 
Hence the “fourth set” of postulates for Boolean algebra, on the base 
(K, +, ’), should read as follows (the class K being understood to contain at 
least two distinct elements) : 

POSTULATE 4.1. If a and b are in K, then a+b is in K.

POSTULATE 4.2. If a is in K, then a' is in K.

POSTULATE 4.3. a+b = b+a.

POSTULATE 4.4. (a+b)+c = a+(b+c).

POSTULATE 4.6. (a'+b')'+(a'+b)' = a [or, ab+ab' = a, where, by
definition, ab = (a'+b')'].

The steps by which the proposition 4.5 (a+a = a) is deduced as a theorem
from Postulates 4.1, 4.2, 4.3, 4.4, and 4.6, are as follows.*

4.10. a'' = a. (Proof as on page 281.)

4.11. a+a' = b+b'.

Proof (without using 4.5). By 4.6, with 4.3 and 4.4,
a+a' = [(a'+b'')' + (a'+b')'] + [(a''+b'')' + (a''+b')']

= [(b'+a'')' + (b'+a')'] + [(b''+a'')' + (b''+a')'] = b+b'.

4.12. Definition. U = a+a' = the "universe element" of the system.

In particular, U = U+U'.

4.15. a+U' = a.

By 4.6, 4.10, 4.12,

(a) U' = (U+U)' + (U+U')' = (U+U)' + U'.

By 4.12, (a), 4.4, 4.12,

U = U+U' = (U+U') + (U+U)' = U + (U+U)'.

Hence by 4.4, 4.12,

(b) U+U = (U+U) + (U+U)' = U.

From (a), (b), U' = U'+U', whence by 4.12,


* For an essential step in this proof I am indebted to Mr. B. Notcutt, a Commonwealth Fel- 
low from Oxford University, at present a graduate student in Harvard University. The following 
article appeared after my brief bibliography was completed: W. V. Quine, A note on Nicod’s postulate, 
Mind, vol. 41 (1932), pp. 345-350. 



(c) (a'+a)' = (a'+a)' + (a'+a)'.

By 4.12, 4.6, 4.4,

a+U' = a + (a'+a)' = [(a'+a')' + (a'+a)'] + (a'+a)'
= (a'+a')' + [(a'+a)' + (a'+a)'],
whence by (c), 4.6,
a+U' = (a'+a')' + (a'+a)' = a.
4.5. a+a = a.
By 4.15, 4.3, 4.12, 4.6, 4.10,
(a+a)' = U' + (a+a)' = (a+a')' + (a+a)' = a'.

Hence, by 4.10, a+a = a.
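
As a sanity check only, and not a substitute for the deduction above, the postulates and the derived propositions can be verified by brute force in one concrete model. The sketch below assumes the model of all subsets of a two-element set, encoded as 2-bit masks, with + read as union and ' as complement; these choices and the function names are made here for illustration.

```python
from itertools import product

FULL = 0b11            # the two-element ground set
K = range(FULL + 1)    # all four subsets, encoded as bitmasks


def plus(a, b):
    """a + b, read here as set union."""
    return a | b


def comp(a):
    """a', read here as complement within the ground set."""
    return FULL ^ a


def holds_everywhere():
    """Check Postulates 4.3, 4.4, 4.6 and propositions 4.10, 4.11, 4.15, 4.5 on all of K."""
    universe = plus(0, comp(0))  # U = a + a', which 4.11 says is independent of a
    for a, b, c in product(K, repeat=3):
        checks = [
            plus(a, b) == plus(b, a),                                         # 4.3
            plus(plus(a, b), c) == plus(a, plus(b, c)),                       # 4.4
            plus(comp(plus(comp(a), comp(b))), comp(plus(comp(a), b))) == a,  # 4.6
            comp(comp(a)) == a,                                               # 4.10
            plus(a, comp(a)) == plus(b, comp(b)) == universe,                 # 4.11, 4.12
            plus(a, comp(universe)) == a,                                     # 4.15
            plus(a, a) == a,                                                  # 4.5
        ]
        if not all(checks):
            return False
    return True


print(holds_everywhere())
```

A check in a single model shows only that the statements are consistent with that model; the independence of 4.5 from the other postulates is settled by the deduction itself.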

The number of postulates in the “fourth set,” as thus corrected, is no 
larger than the number in the “fifth set.” 


HARVARD UNIVERSITY, 
CAMBRIDGE, Mass. 


ON A SPECIAL CLASS OF POLYNOMIALS* 


BY 
OYSTEIN ORE 


In the present paper one will find a discussion of the main properties of a 
special type of polynomials, which I have called p-polynomials. They permit 
several applications to number theory and to the theory of higher congruences 
as I intend to show in a later paper, and they also possess several properties 
which are of interest in themselves. 

The p-polynomials are defined in a field with prime characteristic p
(modular fields); they form a (usually non-commutative) ring, where
ordinary multiplication is replaced by symbolic multiplication, i.e., substitution
of one polynomial into another. The p-polynomials are completely
characterized by the property that the roots form a modulus. This modulus has a
basis, and one shows consequently that the p-polynomials will have a great
number of properties in common with differential and difference equations,
such that the theory of p-polynomials gives an algebraic analogue to the
theory of linear homogeneous differential equations. One finds that the
theorems on the representation of differential polynomials will hold also for
p-polynomials; the decomposition in symbolic prime factors is not unique,
but the factors in two different representations will be similar in pairs. One
can introduce the system of multipliers and the adjoint of a p-polynomial and
even the Picard-Vessiot group of rationality; it corresponds in this case to
a representation of the ordinary Galois group of the p-polynomial by means
of matrices in the finite field (mod p). When this representation is reducible,
the p-polynomial is symbolically reducible and conversely.

In this paper I have given only the fundamental properties in the theory 
of p-polynomials; various interesting problems could only be mentioned, 
while most applications of the theory had to be reserved for another com- 
munication. There are a few applications to higher congruences in §5, 
chapter 1, giving new proofs for theorems by Moore and Dickson; in §6 I 
give a new and simplified proof for the theorem of Dickson on the complete 
set of invariants for the linear group (mod p). The invariants are, as one will 
see, the coefficients of a certain p-polynomial, and a slight generalization of 
the proof of the fundamental theorem on symmetric functions gives the 
desired result. 


* Presented to the Society, February 25, 1933; received by the editors February 3, 1933. 




OYSTEIN ORE [July 


CHAPTER 1. PROPERTIES OF p-POLYNOMIALS 


1. Definition of p-polynomials. Let K be an arbitrary field of characteristic
p, where p is a rational prime. In many of the most important applications K
is a finite field, but this will not be assumed at this stage.*

A polynomial of the form

(1) $F_p(x) = a_0 x^{p^m} + a_1 x^{p^{m-1}} + \cdots + a_{m-1} x^p + a_m x$

with coefficients in K shall be called a p-polynomial; the number m is called
the exponent of $F_p(x)$. When $a_0 = 1$, $F_p(x)$ is said to be reduced. At times
polynomials of the form

(2) $G_{p^f}(x) = a_0 x^{p^{fm}} + a_1 x^{p^{f(m-1)}} + \cdots + a_{m-1} x^{p^f} + a_m x$

will be considered; their properties are quite analogous to those of
p-polynomials (1).

The p-polynomials form a modulus, since they are reproduced by addition 
and subtraction. The pth power of a p-polynomial is again a p-polynomial. 

The ordinary product of two p-polynomials is in general not a p-polynomial.
It is however fundamental that a new symbolic multiplication can be introduced
such that the product of two p-polynomials is again a p-polynomial. This
multiplication is usually not commutative, so that the p-polynomials will form
a non-commutative ring.

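Let namely

(3) $G_p(x) = b_0 x^{p^n} + b_1 x^{p^{n-1}} + \cdots + b_{n-1} x^p + b_n x$

be a second p-polynomial; we then define the symbolic product $F_p(x) \times G_p(x)$ as

(4) $F_p(x) \times G_p(x) = F_p(G_p(x))$

and correspondingly $G_p(x) \times F_p(x) = G_p(F_p(x))$. It follows that

(5) $F_p(x) \times G_p(x) = c_0 x^{p^{m+n}} + c_1 x^{p^{m+n-1}} + \cdots + c_{n+m} x$,

where

(6) $c_0 = a_0 b_0^{p^m}$,
    $c_1 = a_0 b_1^{p^m} + a_1 b_0^{p^{m-1}}$,
    $c_2 = a_0 b_2^{p^m} + a_1 b_1^{p^{m-1}} + a_2 b_0^{p^{m-2}}$,
    $\cdots$
    $c_{n+m} = a_m b_n$.


* For several of the following theorems it is not even necessary to assume that the coefficient
field is commutative.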


1933] A SPECIAL CLASS OF POLYNOMIALS 561 


The exponent of a product is the sum of the exponents of the factors. 
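
The coefficient rule (6) can be tried on a small example. The following Python sketch is not part of the paper. It assumes the coefficient field is GF(9), modelled as pairs (a, b) standing for a + b·i with i² = -1 over the integers mod 3, and it stores a p-polynomial as the list a_0, ..., a_m. The names `sym_mul`, `evaluate` and `frob` are illustrative only.

```python
P = 3  # characteristic; GF(9) = F_3[i] with i*i = -1 (assumption of this sketch)
ZERO, ONE, I = (0, 0), (1, 0), (0, 1)

def add(u, v):
    return ((u[0] + v[0]) % P, (u[1] + v[1]) % P)

def mul(u, v):
    # (a + b i)(c + d i) with i^2 = -1
    return ((u[0] * v[0] - u[1] * v[1]) % P, (u[0] * v[1] + u[1] * v[0]) % P)

def frob(u, k=1):
    """k-fold Frobenius u -> u^(p^k); on GF(9) it is conjugation when k is odd."""
    return (u[0], (-u[1]) % P) if k % 2 else u

def sym_mul(F, G):
    """Symbolic product F x G = F(G(x)); lists hold a_0..a_m, leading coefficient first."""
    m, n = len(F) - 1, len(G) - 1
    C = [ZERO] * (m + n + 1)
    for i, a in enumerate(F):
        for j, b in enumerate(G):
            # formula (6): c_{i+j} receives a_i * b_j^(p^(m-i))
            C[i + j] = add(C[i + j], mul(a, frob(b, m - i)))
    return C

def evaluate(F, x):
    """Value of F at x in GF(9); x^(p^k) is the k-fold Frobenius of x."""
    m = len(F) - 1
    total = ZERO
    for i, a in enumerate(F):
        total = add(total, mul(a, frob(x, m - i)))
    return total

FIELD = [(a, b) for a in range(P) for b in range(P)]
F = [ONE, I, (2, 1)]   # x^9 + i x^3 + (2 + i) x
G = [(1, 1), (0, 2)]   # (1 + i) x^3 + 2i x
FG = sym_mul(F, G)     # coefficients of F(G(x))
GF = sym_mul(G, F)     # coefficients of G(F(x))
```

In this example `FG` and `GF` both have exponent 3, the sum of the exponents of the factors, but their coefficient lists differ, which illustrates the non-commutativity of the symbolic product.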

One immediately observes that the theory of p-polynomials is a special
case of the theory which I have discussed in the paper Theory of non-commutative
polynomials.* One has only to introduce the correspondence

$x^{p^k} \leftrightarrow y^k$,

giving in general

$F_p(x) \leftrightarrow a_0 y^m + a_1 y^{m-1} + \cdots + a_{m-1} y + a_m$,

to recognize the formal identity of the two theories. Since

$x^p \times ax = a^p x^p \leftrightarrow ya = a^p y$,

one sees that the two operations conjugation and differentiation in the general
theory are here simply

$\bar{a} = a^p, \quad a' = 0$.


From the general theory one can now deduce a great number of facts:
In the ring of p-polynomials the symbolic multiplication is associative and
distributive with respect to both right-hand and left-hand multiplication.
The unit element is $E_p(x) = x$ and there are no divisors of zero, i.e., an
identity $A_p(x) \times B_p(x) = 0$ implies $A_p(x) = 0$ or $B_p(x) = 0$.

A p-polynomial $F_p(x)$ is said to be symbolically right-hand divisible by
$D_p(x)$ if $F_p(x) = Q_p(x) \times D_p(x)$. One observes that when $F_p(x)$ is right-hand
symbolically divisible by $D_p(x)$, then $F_p(x)$ is also divisible by $D_p(x)$ in the
ordinary sense. When $F_p(x) = D_p(x) \times Q_p(x)$ we say that $F_p(x)$ is left-hand
symbolically divisible by $D_p(x)$.

Let us now consider division for p-polynomials; supposing $m \geq n$ in (1)
and (3) one finds that the differences

$F_p(x) - a_0 b_0^{-p^{m-n}} x^{p^{m-n}} \times G_p(x)$,

$F_p(x) - G_p(x) \times (a_0 b_0^{-1})^{1/p^n} x^{p^{m-n}}$

do not contain any terms of higher degree than $x^{p^{m-1}}$. It follows, by repetition
of this process, that one can write

(7) $F_p(x) = Q_p(x) \times G_p(x) + R_p(x)$,
    $F_p(x) = G_p(x) \times P_p(x) + S_p(x)$,

where the exponents of $R_p(x)$ and $S_p(x)$ are smaller than n. The coefficients
of $R_p(x)$ are all in K, while the coefficients of $S_p(x)$ lie in some radical field
over K.


* To appear shortly in the Annals of Mathematics. This paper will be quoted as Ore I.


THEOREM 1. Symbolic right-hand division of polynomials is always possible, 
while symbolic left-hand division can only be performed in K, when K is perfect.* 
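
The right-hand division process behind Theorem 1 can be sketched in code. The block below is an illustration under stated assumptions, not the author's procedure verbatim: the coefficient field is taken to be GF(9) (pairs (a, b) for a + b·i, i² = -1, over the integers mod 3), p-polynomials are lists a_0, ..., a_m, the leading coefficient of the divisor is assumed non-zero, and the names `right_divmod`, `sym_mul`, `inv` are hypothetical helpers.

```python
P = 3  # characteristic; GF(9) = F_3[i] with i*i = -1 (assumption of this sketch)
ZERO, ONE, I = (0, 0), (1, 0), (0, 1)

def add(u, v):
    return ((u[0] + v[0]) % P, (u[1] + v[1]) % P)

def sub(u, v):
    return ((u[0] - v[0]) % P, (u[1] - v[1]) % P)

def mul(u, v):
    return ((u[0] * v[0] - u[1] * v[1]) % P, (u[0] * v[1] + u[1] * v[0]) % P)

def frob(u, k=1):
    # k-fold Frobenius u -> u^(p^k)
    return (u[0], (-u[1]) % P) if k % 2 else u

def inv(u):
    # every non-zero u in GF(9) satisfies u^8 = 1, hence u^7 is its inverse
    r = ONE
    for _ in range(7):
        r = mul(r, u)
    return r

def sym_mul(F, G):
    m, n = len(F) - 1, len(G) - 1
    C = [ZERO] * (m + n + 1)
    for i, a in enumerate(F):
        for j, b in enumerate(G):
            C[i + j] = add(C[i + j], mul(a, frob(b, m - i)))
    return C

def right_divmod(F, G):
    """Return (Q, R) with F = Q x G + R and exponent(R) < exponent(G)."""
    R, n, Q = list(F), len(G) - 1, []
    while len(R) - 1 >= n:
        k = len(R) - 1 - n                    # current difference of exponents
        c = mul(R[0], inv(frob(G[0], k)))     # c = a_0 * b_0^(-p^k)
        Q.append(c)
        for j, b in enumerate(G):             # subtract c x^(p^k) x G
            R[j] = sub(R[j], mul(c, frob(b, k)))
        R.pop(0)                              # the leading coefficient is now zero
    return Q, R

F = [ONE, I, (2, 1), (1, 2)]
G = [(1, 1), (0, 2)]
Q, R = right_divmod(F, G)
```

Each pass removes the current leading term by subtracting a left multiple of G, exactly as in the first of the two differences displayed before (7); only field operations in K are needed, which is why no radicals appear on the right-hand side.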

When left-hand divisibility is discussed in the following we shall always 
assume that K is perfect. 

Theorem 1 shows that right-hand (and left-hand) Euclid algorithms
exist, and this shows in turn the existence of a unique (reduced) cross-cut
$(F_p(x), G_p(x)) = D_p(x)$. When $D_p(x) = x$ we say that $F_p(x)$ and $G_p(x)$ are
right-hand symbolically relatively prime, and we can then find polynomials
$A_p(x)$ and $B_p(x)$ of exponents less than n and m respectively such that

(8) $A_p(x) \times F_p(x) + B_p(x) \times G_p(x) = x$.

We shall finally prove the following theorem:


THEOREM 2. The symbolical right-hand cross-cut of $F_p(x)$ and $G_p(x)$ is
equal to the ordinary cross-cut of these polynomials.

This follows from our former remark that every symbolic right-hand 
divisor is also an ordinary divisor of a polynomial and the symbolic Euclid 
algorithm can therefore also be considered as an ordinary Euclid algorithm. 

2. Linear factors. Let us now find the condition that a p-polynomial (1)
be divisible symbolically by a linear factor $x^p - ax$. One finds easily

$F_p(x) = Q_p(x) \times (x^p - ax) + Ax$,

where

(9) $A = a_0 a^{(p^m-1)/(p-1)} + a_1 a^{(p^{m-1}-1)/(p-1)} + \cdots + a_{m-2} a^{p+1} + a_{m-1} a + a_m$.


THEOREM 3. The necessary and sufficient condition that the linear p-polynomial
$x^p - ax$ be a symbolic divisor of $F_p(x)$ is that a be a root of

(10) $a_0 y^{(p^m-1)/(p-1)} + a_1 y^{(p^{m-1}-1)/(p-1)} + \cdots + a_{m-2} y^{p+1} + a_{m-1} y + a_m = 0$,

i.e., a is equal to the (p-1)st power of a root of the equation $F_p(x) = 0$.

One can in the same way find the necessary and sufficient condition that
$F_p(x)$ be left-hand divisible by $x^p - ax$. The result is, in this case, a little more
complicated, namely a must be a root of the equation

(11) $a_0^{1/p^m} y^{(p^m-1)/(p^{m-1}(p-1))} + a_1^{1/p^{m-1}} y^{(p^{m-1}-1)/(p^{m-2}(p-1))} + \cdots + a_{m-2}^{1/p^2} y^{(p+1)/p} + a_{m-1}^{1/p} y + a_m = 0$.


* Compare Theorem 6, chapter I, Ore I. 




From Theorem 3 follows immediately that every p-polynomial will de- 
compose into linear symbolic factors in some finite algebraic extension of K. 
We shall discuss this decomposition later on. 
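
Theorem 3 lends itself to a quick numerical check. The sketch below is an illustration only: it assumes the coefficient field GF(9) (pairs (a, b) for a + b·i with i² = -1 over the integers mod 3), looks only at roots lying in GF(9) itself, and uses hypothetical helper names. For every non-zero root w of three sample p-polynomials it verifies that a = w^(p-1) makes the remainder coefficient A of formula (9) vanish.

```python
P = 3  # characteristic; GF(9) = F_3[i] with i*i = -1 (assumption of this sketch)
ZERO, ONE, I = (0, 0), (1, 0), (0, 1)

def add(u, v):
    return ((u[0] + v[0]) % P, (u[1] + v[1]) % P)

def mul(u, v):
    return ((u[0] * v[0] - u[1] * v[1]) % P, (u[0] * v[1] + u[1] * v[0]) % P)

def frob(u, k=1):
    return (u[0], (-u[1]) % P) if k % 2 else u

def power(u, e):
    r = ONE
    for _ in range(e):
        r = mul(r, u)
    return r

def evaluate(F, x):
    """Value of the p-polynomial F (list a_0..a_m) at x in GF(9)."""
    m = len(F) - 1
    total = ZERO
    for i, a in enumerate(F):
        total = add(total, mul(a, frob(x, m - i)))
    return total

def remainder_coefficient(F, a):
    """A of formula (9), where F = Q x (x^p - a x) + A x."""
    m = len(F) - 1
    total = ZERO
    for i, c in enumerate(F):
        total = add(total, mul(c, power(a, (P ** (m - i) - 1) // (P - 1))))
    return total

FIELD = [(a, b) for a in range(P) for b in range(P)]
examples = [
    [ONE, ZERO, (2, 0)],   # x^9 - x
    [ONE, (0, 2)],         # x^3 - i x
    [ONE, I, (2, 1)],      # x^9 + i x^3 + (2 + i) x
]
checks = []
for F in examples:
    for w in FIELD:
        if w != ZERO and evaluate(F, w) == ZERO:
            # a = w^(p-1) should satisfy equation (10)
            checks.append(remainder_coefficient(F, power(w, P - 1)) == ZERO)
```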

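For the product of linear factors one finds

$(x^p + a_2 x) \times (x^p + a_1 x) = x^{p^2} + (a_1^p + a_2) x^p + a_1 a_2 x$,

and the following theorem can be proved by induction:

THEOREM 4. We have

$(x^p + a_n x) \times \cdots \times (x^p + a_2 x) \times (x^p + a_1 x) = x^{p^n} + A_1 x^{p^{n-1}} + \cdots + A_n x$,

where

$A_i = \sum a_{s_1}^{p^{\alpha_1}} a_{s_2}^{p^{\alpha_2}} \cdots a_{s_i}^{p^{\alpha_i}}$,

where the sum is to be extended over all $s_1 < s_2 < \cdots < s_i$ and $\alpha_r$ such that

$s_r + \alpha_r = n - i + r$ $(r = 1, 2, \dots, i)$.


In the simplest case where all a's are equal to one, it is seen that

$(x^p + x)^n = x^{p^n} + \binom{n}{1} x^{p^{n-1}} + \binom{n}{2} x^{p^{n-2}} + \cdots + \binom{n}{n} x$,

the power on the left being symbolic.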


3. The roots of p-polynomials. The roots of p-polynomials have several
interesting and characteristic properties. Let us consider an equation

(12) $F_p(x) = 0$;

it is obvious that $x = 0$ is always a root. Furthermore if $\omega_1$ and $\omega_2$ are roots, it
is seen without difficulty that $\omega_1 \pm \omega_2$ are roots.

THEOREM 5. The roots of an equation (12) form a finite modulus.


When $a_m \neq 0$ we find $F_p'(x) = a_m \neq 0$ and the equation (12) cannot have
equal roots. The corresponding modulus must have a finite basis and we can
state


THEOREM 6. When $a_m \neq 0$ the roots of $F_p(x) = 0$ form a finite modulus M of
rank m. There exists a basis

(13) $\omega_1, \omega_2, \dots, \omega_m$

for M, such that every root is uniquely representable in the form

(14) $\omega = k_1 \omega_1 + \cdots + k_m \omega_m$ $(k_i = 0, 1, \dots, p-1)$.


A modulus of the form (14) we shall call a p-modulus. It can be shown
that m roots (13) form a basis for M if and only if

(15) $\Delta(\omega_1, \dots, \omega_m) = \begin{vmatrix} \omega_1 & \omega_2 & \cdots & \omega_m \\ \omega_1^p & \omega_2^p & \cdots & \omega_m^p \\ \vdots & \vdots & & \vdots \\ \omega_1^{p^{m-1}} & \omega_2^{p^{m-1}} & \cdots & \omega_m^{p^{m-1}} \end{vmatrix}$

does not vanish.

It should be observed at this point that if one considers the roots of a
$p^f$-equation $G_{p^f}(x) = 0$, where $G_{p^f}(x)$ is given by (2), the roots will also form a
modulus M and one can find a basis

$\Omega_1, \Omega_2, \dots, \Omega_m$

such that every root is representable in the form

$\Omega = \kappa_1 \Omega_1 + \cdots + \kappa_m \Omega_m$,

where the $\kappa_i$ run through all the elements of a finite field with $p^f$ elements.
4. Polynomials with given roots. We shall next consider the inverse problem:
Given a p-modulus $M_p^{(n)}$ of rank n; to construct a p-polynomial $F_n(x)$
of exponent n having the elements of $M_p^{(n)}$ for roots. Let $\omega_1, \dots, \omega_n$ be a basis
for $M_p^{(n)}$. When n = 1 we find simply

(16) $F_1(x) = x(x - \omega_1)(x - 2\omega_1) \cdots (x - (p-1)\omega_1) = x^p - \omega_1^{p-1} x$.

The general expression can now be found by induction. Let $F_{n-1}(x)$ be the
p-polynomial having the roots

$k_1 \omega_1 + \cdots + k_{n-1} \omega_{n-1}$ $(k_i = 0, 1, \dots, p-1)$.

The elements of $M_p^{(n)}$ will then satisfy the equation

$F_n(x) = F_{n-1}(x) F_{n-1}(x - \omega_n) \cdots F_{n-1}(x - (p-1)\omega_n) = 0$,

and since all occurring polynomials are p-polynomials,

$F_n(x) = F_{n-1}(x) (F_{n-1}(x) - F_{n-1}(\omega_n)) \cdots (F_{n-1}(x) - (p-1) F_{n-1}(\omega_n))$,

or finally, as in the case n = 1,

(17) $F_n(x) = F_{n-1}(x)^p - F_{n-1}(\omega_n)^{p-1} F_{n-1}(x)$,

which shows that $F_n(x)$ also is a p-polynomial. Using symbolic multiplication,
we can write $F_n(x)$ in the form

(18) $F_n(x) = (x^p - F_{n-1}(\omega_n)^{p-1} x) \times F_{n-1}(x)$.

This gives by repeated application


THEOREM 7. The p-polynomial $F_n(x)$ having the elements of a p-modulus
$M_p^{(n)}$ for its roots can be written

(19) $F_n(x) = (x^p - F_{n-1}(\omega_n)^{p-1} x) \times (x^p - F_{n-2}(\omega_{n-1})^{p-1} x) \times \cdots \times (x^p - \omega_1^{p-1} x)$,

where $\omega_1, \dots, \omega_n$ is an arbitrary basis for $M_p^{(n)}$. One has also the formula

(20) $F_n(x) = \frac{\Delta(\omega_1, \dots, \omega_n, x)}{\Delta(\omega_1, \dots, \omega_n)}$,

where $\Delta$ denotes the determinant defined by (15).


It is obvious that the polynomial (20) has $\omega_1, \dots, \omega_n$ and hence all elements
of $M_p^{(n)}$ for its roots.


THEOREM 8. The necessary and sufficient condition that the roots of a poly- 
nomial form a modulus is that the polynomial be a p-polynomial. 


The modulus must be finite, and the field of the coefficients must consequently
have the characteristic p. The theorem then follows from Theorems
6 and 7.
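
The recursion (18) translates directly into a short program. The following sketch is illustrative and rests on assumptions not in the paper: the roots are taken in GF(9) (pairs (a, b) for a + b·i, i² = -1, over the integers mod 3), so that p = 3 and p - 1 = 2, and the helper names are hypothetical. It builds $F_n(x)$ from a basis one linear factor at a time.

```python
P = 3  # characteristic; GF(9) = F_3[i] with i*i = -1 (assumption of this sketch)
ZERO, ONE, I = (0, 0), (1, 0), (0, 1)

def add(u, v):
    return ((u[0] + v[0]) % P, (u[1] + v[1]) % P)

def neg(u):
    return ((-u[0]) % P, (-u[1]) % P)

def mul(u, v):
    return ((u[0] * v[0] - u[1] * v[1]) % P, (u[0] * v[1] + u[1] * v[0]) % P)

def frob(u, k=1):
    return (u[0], (-u[1]) % P) if k % 2 else u

def sym_mul(F, G):
    m, n = len(F) - 1, len(G) - 1
    C = [ZERO] * (m + n + 1)
    for i, a in enumerate(F):
        for j, b in enumerate(G):
            C[i + j] = add(C[i + j], mul(a, frob(b, m - i)))
    return C

def evaluate(F, x):
    m = len(F) - 1
    total = ZERO
    for i, a in enumerate(F):
        total = add(total, mul(a, frob(x, m - i)))
    return total

def poly_from_basis(basis):
    """Apply (18) repeatedly: F_n = (x^p - F_{n-1}(w_n)^(p-1) x) x F_{n-1}."""
    F = [ONE]                            # F_0(x) = x
    for w in basis:
        t = evaluate(F, w)               # F_{n-1}(w_n)
        linear = [ONE, neg(mul(t, t))]   # x^3 - t^2 x, since p - 1 = 2
        F = sym_mul(linear, F)
    return F

F_a = poly_from_basis([ONE, I])          # basis 1, i of GF(9)
F_b = poly_from_basis([(1, 1), (0, 2)])  # another basis of the same modulus
```

Both bases span all of GF(9), and both calls return the coefficients of $x^9 - x$, in line with the statement of Theorem 7 that the basis may be chosen arbitrarily.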

5. Applications to higher congruences. The results of §4 immediately give
various theorems on congruences (mod p).

From the definition of $F_n(x)$ and from (20) follows

(21) $\frac{\Delta(\omega_1, \dots, \omega_n, x)}{\Delta(\omega_1, \dots, \omega_n)} \equiv \prod_{k_i = 0}^{p-1} (x + k_1 \omega_1 + \cdots + k_n \omega_n) \pmod{p}$,

which is a generalization of well known identities in higher congruences.
When one compares the last term in x on both sides one obtains the following
generalization of Wilson's theorem:


THEOREM 9. Let $M_p^{(n)}$ be a finite modulus (mod p) and let $\omega_1, \dots, \omega_n$ be
a basis for the modulus; then

(22) $\prod \omega \equiv (-1)^n \Delta(\omega_1, \dots, \omega_n)^{p-1} \pmod{p}$,

where $\omega \neq 0$ runs through all elements of $M_p^{(n)}$.


Let us finally apply the formula (21) to the case of n-1 basis elements
$\omega_1, \dots, \omega_{n-1}$, and let us put $x = \omega_n$. This gives

$\Delta(\omega_1, \dots, \omega_n) \equiv \Delta(\omega_1, \dots, \omega_{n-1}) \prod_{k_i = 0}^{p-1} (k_1 \omega_1 + \cdots + k_{n-1} \omega_{n-1} + \omega_n) \pmod{p}$

and we have a simple proof of a theorem by E. H. Moore.*

* E. H. Moore, Bulletin of the American Mathematical Society, vol. 2 (1896), p. 189.


THEOREM 10. The following identity holds:

(23) $\Delta(\omega_1, \dots, \omega_n) \equiv \prod_{i=1}^{n} \prod_{k_j = 0}^{p-1} (k_1 \omega_1 + \cdots + k_{i-1} \omega_{i-1} + \omega_i) \pmod{p}$.

It can be stated by saying that $\Delta(\omega_1, \dots, \omega_n)$ is congruent to the product
of all different linear expressions in the $\omega_i$, considering two such expressions
equal if they are proportional.

Another result is the following:

Let $H_p(x) = A_p(x) \times B_p(x)$; then

$H_p(x) = \prod (B_p(x) + \omega)$,

where $\omega$ runs through the modulus of all roots of $A_p(x) = 0$ and the product sign
indicates ordinary multiplication.

This simple remark contains and generalizes various theorems on higher
congruences by Mathieu* and Dickson.†
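
The generalized Wilson theorem and Moore's determinant identity of this section can be checked numerically in the smallest non-trivial case n = 2. The sketch below is illustrative only; it assumes the elements are taken in GF(9) (pairs (a, b) for a + b·i with i² = -1 over the integers mod 3, so p = 3), and its function names are hypothetical. For every pair $(\omega_1, \omega_2)$ it compares $\Delta(\omega_1, \omega_2) = \omega_1 \omega_2^p - \omega_2 \omega_1^p$ with the product of linear forms, and for independent pairs it also compares the product of the non-zero elements of the modulus with $(-1)^2 \Delta^{p-1}$.

```python
P = 3  # characteristic; GF(9) = F_3[i] with i*i = -1 (assumption of this sketch)
ZERO, ONE = (0, 0), (1, 0)

def add(u, v):
    return ((u[0] + v[0]) % P, (u[1] + v[1]) % P)

def sub(u, v):
    return ((u[0] - v[0]) % P, (u[1] - v[1]) % P)

def mul(u, v):
    return ((u[0] * v[0] - u[1] * v[1]) % P, (u[0] * v[1] + u[1] * v[0]) % P)

def frob(u):
    # u -> u^3 on GF(9)
    return (u[0], (-u[1]) % P)

def scal(k, u):
    # multiple k*u with k an integer mod p
    return ((k * u[0]) % P, (k * u[1]) % P)

def delta2(w1, w2):
    """The 2 x 2 determinant (15): w1 * w2^p - w2 * w1^p."""
    return sub(mul(w1, frob(w2)), mul(w2, frob(w1)))

FIELD = [(a, b) for a in range(P) for b in range(P)]
results, independent_pairs = [], 0
for w1 in FIELD:
    for w2 in FIELD:
        d = delta2(w1, w2)
        # Moore's product for n = 2: w1 * prod_k (k w1 + w2)
        moore = w1
        for k in range(P):
            moore = mul(moore, add(scal(k, w1), w2))
        ok = (moore == d)
        if d != ZERO:
            independent_pairs += 1
            wilson = ONE
            for k1 in range(P):
                for k2 in range(P):
                    if (k1, k2) != (0, 0):
                        wilson = mul(wilson, add(scal(k1, w1), scal(k2, w2)))
            ok = ok and (wilson == mul(d, d))   # (-1)^2 * Delta^(p-1)
        results.append(ok)
```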

6. The invariants of linear groups (mod p). We shall now consider the
symmetric functions of the roots of a p-polynomial. From (20) it follows that
the p-polynomial corresponding to a given modulus $M_p^{(n)}$ has the form

(24) $F_n(x) = x^{p^n} + A_1 x^{p^{n-1}} + \cdots + A_n x$,

where

(25) $A_i = (-1)^i \frac{\Delta^{(i)}(\omega_1, \dots, \omega_n)}{\Delta(\omega_1, \dots, \omega_n)}$,

and where $\Delta^{(i)}(\omega_1, \dots, \omega_n)$ denotes the minor of the term $x^{p^{n-i}}$ in the
determinant $\Delta(\omega_1, \dots, \omega_n, x)$. Every symmetric function of the elements of
$M_p^{(n)}$ can therefore be expressed by the rational functions (25) of the $\omega_i$.

We shall now consider the inverse problem: When is a rational function
$F(x_1, \dots, x_n)$ a symmetric function of the $p^n - 1$ linear forms

(26) $\omega(x_1, \dots, x_n) = k_1 x_1 + \cdots + k_n x_n$ $(k_i = 0, 1, \dots, p-1)$,

the combination $k_1 = \cdots = k_n = 0$ excluded. We shall prove


THEOREM 11. The necessary and sufficient condition that $F(x_1, \dots, x_n)$
be a symmetric function of the linear forms (26) is that $F(x_1, \dots, x_n)$ be an
absolute invariant of the full linear group of order n (mod p).


When $F(x_1, \dots, x_n)$ is representable as a symmetric function of the forms
(26) it is representable by the coefficients $A_i$ in (24). From the representation
(25) it is easily seen that these coefficients are absolute invariants by all
linear substitutions of the $\omega_i$ with non-vanishing determinant (mod p).


* E. Mathieu, Journal de Mathématiques, (2), vol. 6 (1861), pp. 241-323.
† L. E. Dickson, Bulletin of the American Mathematical Society, vol. 3 (1897), pp. 381-389.


To prove the converse, let $F(x_1, \dots, x_n)$ be an absolute invariant; one
can assume, without loss of generality, that $F(x_1, \dots, x_n)$ is integral. If
we write

(27) $F(x_1, \dots, x_n) = \sum_i B_i(x_2, \dots, x_n) x_1^i$,

the $B_i(x_2, \dots, x_n)$ must be absolute invariants of the linear group on the
(n-1) variables $x_2, \dots, x_n$. Let us put
n-l 
(28) = [JI (mt + = 


where the coefficients of A as polynomial in x; are also invariants of the group 
in m—1 variables. We can now divide F(m,---, %,) by the powers of A 
and obtain a representation of the form 

(29) F(x1,° ++, = + + - - - + Ro(x1), 


where the coefficients R,(x1) are polynomials of degree smaller than the 
degree of A in x, and with coefficients which are invariants in the m—1 re- 
maining variables. We shall now show that x; does not occur in any R;(x;). Let 
us suppose namely that 

(30) Ri(x1) = So(x2,-- , Xn) + tn) 

It follows from the representation (28) that A is invariant under an arbitrary 
substitution of the form 

+ + RnXn; ky # 0, 


(31) 


Since the representation (29) is unique, all coefficients in (29) must also be 
invariant under the substitutions (31). From (30) we obtain, however, 


Ri(x1) — So(x2, = K(x), 


and applying all substitutions (31) to this identity we find that the difference 
Ri(x1) —So(x2, - - Xn) is divisible by A, giving =So(x2, ---, 
This gives the special form 


(32) F(x1,--+ %n) = Ri(xa,--- , +--+ + , Xn) 


for the representation (29). 
The remaining part of the proof is analogous to the proof for the principal 




theorem on symmetric functions. The terms of F(x, - - - , x,) are arranged in 
decreasing order as usual in this proof, and we assume that 


(33) +++ HF", a, a2 2 an, 


is the principal term. a is then the highest exponent of any power of x; 
which occurs in F(m, +--+, %,); according to (32) a, must be divisible by 
p"—p""; a. is the highest power of x2 contained in the invariant 
R.(x:,--+,%,) and it is therefore by the same reason divisible by 
p"-!— p"-? etc. It follows that the principal term (33) must have the form 


n—2 


= 


The invariant A; in (25) has the principal term 


and the difference 
F(x, Xn) = (+ aA 44-4 


only contains terms lower than (34) and one obtains a representation of 
F(a, ---, %n) by the A; through repetition of this process. It also follows 


that if 
F(x, ee Xn) = R(A,, 

is the representation of the integral invariant F(x, - - - , x.) then the coeffi- 
cients of R belong to the ring generated by the coefficients of F. 

An immediate consequence of this proof is 

THEOREM 12. The polynomials 
tn) 

A(x, Xn) 


(35) %) = 


form a fundamental system for all the absolute invariants of the linear group of n
variables (mod p).

A relative invariant of the linear group is an expression $G(x_1, \dots, x_n)$
which is only multiplied by a power of the substitution determinant by a
linear substitution (mod p); $\Delta(x_1, \dots, x_n)$ is a relative invariant and by
multiplying by a suitable power of $\Delta(x_1, \dots, x_n)$ one obtains a very simple
proof of a theorem by Dickson*:


* L. E. Dickson, these Transactions, vol. 12 (1911), pp. 75-98. 




THEOREM 13. The polynomials 


form a fundamental system for all relative invariants of the linear group on n
variables (mod p).


The polynomial A,(%, - - - , xn) in (36) has been omitted since 
A,(*, Xn) = A(x, 


Dickson has proved Theorem 13 for the somewhat more general case 
in which the linear group is supposed to have coefficients in an arbitrary finite 
field. Our proof holds with slight modifications also for this case. In the same 
paper Dickson considers the “Formenproblem” of the invariants: i.e., the 
problem of finding the values of the variables x; for which the invariants 
assume prescribed values. From our point of view, this is identical with the 
problem of solving the equation defined by the corresponding p-polynomial, 
a problem which has already been discussed at some length. 

7. The resultant. An important invariant of two p-polynomials $F_p(x)$
and $G_p(x)$ defined by (1) and (3) respectively is the so-called p-resultant
$R_p(F_p(x), G_p(x))$. Let

be the basis elements of the two corresponding p-moduli; the determinant 


A(wi, - ++, @m, ¥1, -* +, Wn) is then according to (22) equal to the product 
of all possible different linear combinations 


(37) + + + by 


in which not all coefficients vanish, and where two expressions (37) are con- 
sidered to be equal if one can be obtained from the other through multiplica- 
tion with a rational integer. We then define the p-resultant of F,(x) and 
G,(x) by putting 

This resultant is, we see, the product of the differences of all non-vanishing 


roots of the two polynomials, considering as before two differences w—y and 
k(w—w) as being equal. It is therefore 


(38) R,(F,G) = 


R,(F,G) = 
x 


G,( 


x 




where R denotes the ordinary resultant. I mention without proof that the 
p-resultant of F,(x) and G,(x) can be represented in the form 

One can also find a representation of R, by means of the coefficients of F,(x) 
and G,(x). 

8. The adjoint of a p-polynomial. To a given modulus $M_p^{(n)}$ we construct
the adjoint modulus $\bar{M}_p^{(n)}$


(39) R,(F Gz) 


A(we, Wn) A(w, W3,°" Wn) 

» = (— 1)**? 
A(w, Wn) A(w, Wn) 


— 
Wn) 


@, = (— 1)" 


(40) 


We show simply that these numbers are linearly independent and therefore
can be regarded as the basis of a modulus $\bar{M}_p^{(n)}$.

In §4, we have found that the reduced polynomial $F_n(x)$ having the
modulus $M_p^{(n)}$ for roots will be left-hand divisible by $x^p - \beta x$, where according
to (18) and (20)

$\beta = F_{n-1}(\omega_n)^{p-1} = \left( \frac{\Delta(\omega_1, \dots, \omega_n)}{\Delta(\omega_1, \dots, \omega_{n-1})} \right)^{p-1} = \bar\omega_n^{-(p-1)}$.

By changing the order of the basis elements $\omega_i$ of $M_p^{(n)}$ one can deduce in the
same way that $F_n(x)$ is left-hand divisible by all factors

$x^p - \bar\omega_i^{-(p-1)} x$,

and finally also by all factors

$x^p - \bar\omega^{-(p-1)} x$,

where $\bar\omega$ is an arbitrary element of the adjoint modulus $\bar{M}_p^{(n)}$.
Let on the other hand $\bar\omega$ be an element such that

(41) $F_n(x) = (x^p - \bar\omega^{-(p-1)} x) \times Q_p(x)$.

The roots of $Q_p(x)$ will form a submodulus $M_p^{(n-1)}$ of $M_p^{(n)}$, and since a basis
of $M_p^{(n-1)}$ may be completed to a basis for $M_p^{(n)}$ we see that $\bar\omega$ must be an
element of $\bar{M}_p^{(n)}$. This leads to the following result which may also be used
as a definition of $\bar{M}_p^{(n)}$:

THEOREM 14. The adjoint modulus $\bar{M}_p^{(n)}$ to $M_p^{(n)}$ consists of all elements $\bar\omega$
such that the corresponding p-polynomial $F_n(x)$ to $M_p^{(n)}$ has a decomposition of
the form (41).


We shall express this result in a somewhat different form. From (41) we
obtain

(42) $\bar\omega^p F_n(x) = ((\bar\omega x)^p - \bar\omega x) \times Q_p(x) = (x^p - x) \times \bar\omega x \times Q_p(x)$.

An element $\kappa$ such that

(43) $\kappa F_p(x) = (x^p - x) \times R_p(x)$

shall be called a multiplier of $F_p(x)$. It is obvious that the multipliers form a
modulus, and from (42) and Theorem 14 we find


THEOREM 15. The multipliers of a polynomial $F_p(x)$ form a modulus
$N_p^{(n)}$ of rank n which is equal to the modulus of the pth powers of the adjoint
modulus $\bar{M}_p^{(n)}$ to the modulus $M_p^{(n)}$ of the roots of $F_p(x) = 0$.


Let us now determine the p-polynomial corresponding to the adjoint
modulus $\bar{M}_p^{(n)}$ or to the modulus $N_p^{(n)}$ of the multipliers, which is virtually
the same problem. If $F_p(x)$ is left-hand divisible by $x^p - \beta x$, then
$\beta = \kappa^{-(p-1)/p}$, where $\kappa$ is a multiplier, and the condition (11) for left-hand
linear factors gives


THEOREM 16. The multipliers of

(44) $F_p(x) = x^{p^n} + A_1 x^{p^{n-1}} + \cdots + A_n x$

are the roots of the equation

(45) $(A_n x)^{p^n} + (A_{n-1} x)^{p^{n-1}} + \cdots + (A_1 x)^p + x = 0$.


Since the roots of $x^p - x = 0$ are 0, 1, ..., p-1, we observe the following
result: If $\kappa$ is a multiplier of $F_p(x)$ giving the decomposition (43), then

(46) $\kappa F_p(x) = \prod_{i=0}^{p-1} (R_p(x) + i)$,

where the product sign denotes ordinary multiplication.

We have in the preceding supposed $F_p(x)$ to be reduced. If $F_p(x)$ in (44)
has the highest coefficient $A_0$ then the multipliers will be $\kappa / A_0$, where
$\kappa$ is a multiplier of the corresponding reduced polynomial.

In general we shall call the polynomial

(47) $\bar{F}_p(x) = (A_n x)^{p^n} + (A_{n-1} x)^{p^{n-1}} + \cdots + (A_1 x)^p + A_0 x$

the adjoint of $F_p(x)$. The adjoint of $\bar{F}_p(x)$ is

$\bar{\bar{F}}_p(x) = A_0^{p^n} x^{p^n} + A_1^{p^n} x^{p^{n-1}} + \cdots + A_n^{p^n} x = x^{p^n} \times F_p(x) \times x^{p^{-n}}$,

and for the adjoint of a product one finds

$\overline{F_p(x) \times G_p(x)} = x^{p^n} \times \bar{G}_p(x) \times x^{p^{-n}} \times \bar{F}_p(x)$,

where n is the exponent of $F_p(x)$.

It may be more simple to introduce fractional powers and define the adjoint
polynomial by putting

$\bar{F}_p(x) = A_n x + (A_{n-1} x)^{p^{-1}} + \cdots + (A_1 x)^{p^{-(n-1)}} + (A_0 x)^{p^{-n}}$.

This expression has the same roots as (47) and it has the simpler properties
that the adjoint of a sum is the sum of the adjoints, the adjoint of a product
is equal to the product of the adjoints in inverse order, and also simply
$\bar{\bar{F}}_p(x) = F_p(x)$.

Let us finally determine when $\bar{F}_p(x) = F_p(x)$, using the definition (47).
We obtain the relations

(48) $A_i = A_{n-i}^{p^{n-i}}$ $(i = 0, 1, \dots, n)$,

and also

$A_{n-i} = A_i^{p^i}$,

giving

$A_i^{p^n} = A_i$ $(i = 0, 1, \dots, n)$.


THEOREM 17. When a polynomial $F_p(x)$ is self-adjoint, all coefficients must
belong to a finite field of $p^n$ elements; in addition the relations (48) must hold.
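
The formal rules for the adjoint (47) can be exercised on coefficient lists. The sketch below is an illustration under assumptions that are not in the paper: coefficients lie in GF(9) (pairs (a, b) for a + b·i, i² = -1, over the integers mod 3), a polynomial $A_0 x^{p^n} + \cdots + A_n x$ is stored as the list A_0, ..., A_n, conjugation by $x^{p^n}$ is modelled as raising every coefficient to the power $p^n$, and the names `adjoint` and `sym_mul` are hypothetical.

```python
P = 3  # characteristic; GF(9) = F_3[i] with i*i = -1 (assumption of this sketch)
ZERO, ONE, I = (0, 0), (1, 0), (0, 1)

def add(u, v):
    return ((u[0] + v[0]) % P, (u[1] + v[1]) % P)

def mul(u, v):
    return ((u[0] * v[0] - u[1] * v[1]) % P, (u[0] * v[1] + u[1] * v[0]) % P)

def frob(u, k=1):
    # k-fold Frobenius u -> u^(p^k)
    return (u[0], (-u[1]) % P) if k % 2 else u

def sym_mul(F, G):
    m, n = len(F) - 1, len(G) - 1
    C = [ZERO] * (m + n + 1)
    for i, a in enumerate(F):
        for j, b in enumerate(G):
            C[i + j] = add(C[i + j], mul(a, frob(b, m - i)))
    return C

def adjoint(F):
    """Definition (47): the coefficient of x^(p^j) in the adjoint is A_j^(p^j)."""
    n = len(F) - 1
    return [frob(F[n - i], n - i) for i in range(n + 1)]

def conjugate(F, n):
    """x^(p^n) x F x x^(p^-n): every coefficient raised to the power p^n."""
    return [frob(c, n) for c in F]

F = [ONE, I]             # x^3 + i x, exponent n = 1
G = [ONE, (1, 1)]        # x^3 + (1 + i) x
n = len(F) - 1
lhs = adjoint(sym_mul(F, G))                          # adjoint of F x G
rhs = sym_mul(conjugate(adjoint(G), n), adjoint(F))   # x^(p^n) x adj(G) x x^(p^-n) x adj(F)
double = adjoint(adjoint(F))                          # should equal conjugate(F, n)
```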


CHAPTER 2. FORMAL THEORY 

1. The union of p-polynomials. Let $F_p(x)$ and $G_p(x)$ be two p-polynomials
given by (1) and (3); the reduced polynomial

$M_p(x) = [F_p(x), G_p(x)]$

of smallest degree with coefficients in K which is right-hand symbolically
divisible by both $F_p(x)$ and $G_p(x)$ is called the least common multiple or the
union of $F_p(x)$ and $G_p(x)$. From the existence of a Euclid algorithm the existence
of the union follows; it has the exponent m+n-d, where d is the exponent
of the cross-cut $D_p(x) = (F_p(x), G_p(x))$.

Let as before 

(m) (n) 


(1) M, = (w1,--+,@m), = (Vi, 


be the basis of the two moduli formed by the roots of the two polynomials 
F,(x) and G,(x). The modulus corresponding to D,(x) is then the modulus 
formed by the common elements of M,‘” and V,™ while the modulus of the 


union is 




(n+m—d) 


When F,,(x) is relatively prime to G,(x) we find 
- Wm, W1,°°* x) 
A(wi, - Wm, Wn) 


because the right-hand side is a reduced polynomial having the same roots 
as the union. For the same reason we see that also 


A(F (v1), F,(x)) 
F(x), Gp = 

= AG, ,Gy(wm), G,(x)) 
AG, (a1), - - , Gp(wm)) 


(2) [F p(x), Gp(x)] = 


(3) 


represent the union. 

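As an application let us determine the union of a reduced polynomial $F_p(x)$
and a linear factor $x^p - ax$. Since the roots of the latter are $k a^{1/(p-1)}$ (k = 0,
1, ..., p-1), we find, using formula (17), chapter 1,

$M_p(x) = F_p(x)^p - F_p(a^{1/(p-1)})^{p-1} F_p(x)$.

If we put, for $F_p(x) = x^{p^n} + A_1 x^{p^{n-1}} + \cdots + A_n x$,

(4) $\varphi(x) = x^{(p^n-1)/(p-1)} + A_1 x^{(p^{n-1}-1)/(p-1)} + \cdots + A_{n-1} x + A_n$,

a simple reduction shows


THEOREM 1. The union of a reduced polynomial $F_p(x)$ and a linear polynomial
$x^p - ax$ is

(5) $M_p(x) = F_p(x)^p - a \varphi(a)^{p-1} F_p(x)$
    $= x^{p^{n+1}} + (A_1^p - a \varphi(a)^{p-1}) x^{p^n} + (A_2^p - a \varphi(a)^{p-1} A_1) x^{p^{n-1}} + \cdots - a \varphi(a)^{p-1} A_n x$,

where $\varphi(x)$ is defined by (4).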
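
The expression for the union with a linear factor can be tested by checking that it is symbolically right-hand divisible by $x^p - ax$. The sketch below is illustrative and makes assumptions of its own: coefficients lie in GF(9) (pairs (a, b) for a + b·i, i² = -1, over the integers mod 3, so p - 1 = 2), the value $\varphi(a)$ is computed as the remainder coefficient of formula (9), chapter 1, for a reduced polynomial, and the helper names are hypothetical. It forms $(x^p - a\varphi(a)^{p-1} x) \times F_p(x)$, which equals $F_p(x)^p - a\varphi(a)^{p-1} F_p(x)$, and checks that its own remainder coefficient at a vanishes.

```python
P = 3  # characteristic; GF(9) = F_3[i] with i*i = -1 (assumption of this sketch)
ZERO, ONE, I = (0, 0), (1, 0), (0, 1)

def add(u, v):
    return ((u[0] + v[0]) % P, (u[1] + v[1]) % P)

def neg(u):
    return ((-u[0]) % P, (-u[1]) % P)

def mul(u, v):
    return ((u[0] * v[0] - u[1] * v[1]) % P, (u[0] * v[1] + u[1] * v[0]) % P)

def frob(u, k=1):
    return (u[0], (-u[1]) % P) if k % 2 else u

def power(u, e):
    r = ONE
    for _ in range(e):
        r = mul(r, u)
    return r

def sym_mul(F, G):
    m, n = len(F) - 1, len(G) - 1
    C = [ZERO] * (m + n + 1)
    for i, a in enumerate(F):
        for j, b in enumerate(G):
            C[i + j] = add(C[i + j], mul(a, frob(b, m - i)))
    return C

def remainder_coefficient(F, a):
    """A of formula (9), chapter 1; equals phi(a) when F is reduced."""
    m = len(F) - 1
    total = ZERO
    for i, c in enumerate(F):
        total = add(total, mul(c, power(a, (P ** (m - i) - 1) // (P - 1))))
    return total

def union_with_linear(F, a):
    """(x^p - a phi(a)^(p-1) x) x F, i.e. F^p - a phi(a)^(p-1) F."""
    phi = remainder_coefficient(F, a)
    c = mul(a, power(phi, P - 1))
    return sym_mul([ONE, neg(c)], F)

FIELD = [(a, b) for a in range(P) for b in range(P)]
F = [ONE, I, (2, 1)]     # reduced: x^9 + i x^3 + (2 + i) x
results = [remainder_coefficient(union_with_linear(F, a), a) == ZERO for a in FIELD]
```

The left factor $x^p - a\varphi(a)^{p-1} x$ produced here is the same linear polynomial that appears as the transform of $x^p - ax$ by $F_p(x)$.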
2. Transformation of p-polynomials. The existence of a union for two
arbitrary p-polynomials permits us to introduce a new operation on
p-polynomials, which we shall call transformation.

The polynomial

(6) $A_p^{(1)}(x) = a_0 b_0^{p^{n-d}} [A_p(x), B_p(x)] \times B_p(x)^{-1} = B_p A_p(x) B_p^{-1}$

is called the transform of $A_p(x)$ by $B_p(x)$. The notation is such that $A_p(x)$ has
the exponent n, $B_p(x)$ the exponent m, while d is the exponent of the cross-cut

(7) $D_p(x) = (A_p(x), B_p(x))$, $A_p(x) = \bar{A}_p(x) \times D_p(x)$, $B_p(x) = \bar{B}_p(x) \times D_p(x)$.

Finally, $a_0$ and $b_0$ are the highest coefficients of $A_p(x)$ and $B_p(x)$ and the
numerical constant in (6) is chosen such that the transform has the same
highest coefficient as $A_p(x)$.

When $A_p(x)$ is relatively prime to $B_p(x)$, we say that (6) is a special
transformation; the transform has then the exponent n. When a cross-cut
$D_p(x)$ exists we call the transformation general, and the transform has the
exponent n-d. The general transformation can always be reduced to a special
transformation, since it follows from (6) and (7) that

(8) $B_p A_p(x) B_p^{-1} = a_0 b_0^{p^{n-d}} [\bar{A}_p(x), \bar{B}_p(x)] \times \bar{B}_p(x)^{-1} = \bar{B}_p \bar{A}_p(x) \bar{B}_p^{-1}$.

When the polynomial $A_p^{(1)}(x)$ is obtained from $A_p(x)$ by a special transformation,
we say that $A_p^{(1)}(x)$ is similar to $A_p(x)$. It can be shown that the
notion of similarity is symmetric, reciprocal and associative.

There exist a large number of results on the transformation of p-polynomials
which can all be deduced from the general polynomial theory. They
will be given here without proof:*

When $B_p^{(1)}(x) \equiv B_p(x) \pmod{A_p(x)}$ then

(9) $B_p^{(1)} A_p(x) (B_p^{(1)})^{-1} = B_p A_p(x) B_p^{-1}$.

Furthermore

(10) $(C_p B_p) A_p(x) (C_p B_p)^{-1} = C_p (B_p A_p(x) B_p^{-1}) C_p^{-1}$.

From (9) and (10) it follows that if $A_p^{(1)}(x) = B_p A_p(x) B_p^{-1}$, where $B_p(x)$
is relatively prime to $A_p(x)$, then $A_p(x) = B_p^{(1)} A_p^{(1)}(x) (B_p^{(1)})^{-1}$, when $B_p^{(1)}(x)$
is determined such that

$B_p^{(1)}(x) \times B_p(x) \equiv x \pmod{A_p(x)}$,

which is always possible according to (8), chapter 1.
For the transformation of a union one finds simply

(11) $C_p [A_p(x), B_p(x)] C_p^{-1} = [C_p A_p(x) C_p^{-1}, C_p B_p(x) C_p^{-1}]$;

the corresponding formula does not hold for the cross-cut. For the transform
of a product of reduced factors one finds

(12) $C_p (B_p(x) \times A_p(x)) C_p^{-1} = C_p^{(1)} B_p(x) (C_p^{(1)})^{-1} \times C_p A_p(x) C_p^{-1}$,

where $C_p^{(1)}(x) = A_p C_p(x) A_p^{-1}$. For an arbitrary number of factors one finds


* The proofs follow from Ore I. 




a corresponding result, which gives the theorem that the transform of a 
product is made up of factors which are similar to the factors in the original 
product. 

The following theorem has some important applications in the formal
representations of p-polynomials.

If a product $A_p(x) \times B_p(x)$ is divisible by $C_p(x)$ and $C_p(x)$ is relatively prime
to $B_p(x)$, then $A_p(x)$ is divisible by $B_p C_p(x) B_p^{-1}$.

Let us now consider the expression for the transform in terms of the roots
of the polynomials. From (3) and the definition of transformation follows


THEOREM 2. When $B_p(x)$ is relatively prime to $A_p(x)$, we have

(13) $B_p A_p(x) B_p^{-1} = a_0 \frac{\Delta(B_p(\omega_1), \dots, B_p(\omega_n), x)}{\Delta(B_p(\omega_1), \dots, B_p(\omega_n))}$,

where the $\omega_i$ form a basis for the roots of $A_p(x)$.


When $\omega$ is an arbitrary element in the modulus of $A_p(x)$, then the modulus
of $B_p A_p(x) B_p^{-1}$ consists of all numbers $B_p(\omega)$ and this holds even in the general
case. The transformation is consequently analogous to the Tschirnhausen
transformation for algebraic equations.

As an application let us find the transform of a linear polynomial $x^p - ax$
by an arbitrary polynomial $F_p(x)$. From Theorem 1 follows

(14) $F_p (x^p - ax) F_p^{-1} = x^p - a \varphi(a)^{p-1} x$.

One can easily determine when two linear expressions

(15) $x^p - ax$, $x^p - bx$

are similar. According to (14) every polynomial similar to a linear polynomial
can be obtained from it by transformation with an expression cx, and there
follows from (14)


THEOREM 3. Two linear polynomials (15) are similar when the quotient
$ab^{-1} = c^{p-1}$ is a (p-1)st power in K.

3. Decomposition into prime factors. We shall say that two reduced polynomials
$A_p(x)$ and $B_p(x)$ are transmutable if $A_p(x)$ can be represented in the
form

$A_p(x) = B_p A_p^{(1)}(x) B_p^{-1}$,

where $A_p^{(1)}(x)$ is similar to $A_p(x)$. In this case the product

$A_p(x) \times B_p(x) = [A_p^{(1)}(x), B_p(x)] = A_p^{(1)} B_p(x) (A_p^{(1)})^{-1} \times A_p^{(1)}(x)$

can be written in two ways, such that the factors are similar, but occur in
different order. We shall say that one representation is obtained from the
other by transmutation. As an example let us find when two linear factors are
transmutable. Let

$A_p(x) = x^p - ax$, $B_p(x) = x^p - bx$, $A_p^{(1)}(x) = x^p - cx$;

then according to (14)

$B_p A_p^{(1)}(x) B_p^{-1} = x^p - c(c - b)^{p-1} x$

and c must be a root of the equation

$c(c - b)^{p-1} = a$.


A prime polynomial P,(x) in K is a polynomial which has no reduced 
symbolical divisors except itself and x. Every polynomial similar to a prime 
polynomial is also prime. One can then prove the following theorem*: 


THEOREM 4. Every reduced polynomial has a decomposition into prime 
factors. Two different decompositions of the same polynomial will have the same 
number of factors; the factors will be similar in pairs by a suitable ordering, and 
one decomposition can be obtained from the other through transmutation of factors. 


It is easily seen that one cannot expect the decomposition to be unique;
if $F_p(x)$ is an arbitrary polynomial with the exponent n, then $F_p(x)$ is divisible
by all the linear factors $x^p - \omega^{p-1} x$, where $\omega$ is an arbitrary root.

4. Completely reducible polynomials. We shall say that a polynomial
$F_p(x)$ in K is completely reducible when it is the union of prime polynomials.
It can then be represented by a basis

$F_p(x) = [P_1(x), \dots, P_r(x)]$,

where each prime polynomial $P_i(x)$ is relatively prime to the union of the
others. We can also show the following:

The necessary and sufficient condition that a polynomial be completely reducible
is that two consecutive prime factors in an arbitrary prime polynomial
decomposition always be transmutable.

The union of all prime polynomials, which divide an arbitrary polynomial
$F_p(x)$ on the right, we shall call the maximal completely reducible factor
of $F_p(x)$ and denote by $H_p^{(1)}(x)$. Then

$F_p(x) = F_p^{(1)}(x) \times H_p^{(1)}(x)$,

and $F_p^{(1)}(x)$ can be treated the same way; there follows


* Ore I, Theorem 1, chapter 2. 




THEOREM 5. Every polynomial has a unique representation as product of 
maximal completely reducible factors. 


From the general theory a large number of results on completely reducible 
polynomials can be deduced.* We shall however only mention a few facts, 
which we shall apply at a later point. 

We shall say that a completely reducible polynomial is uniform, when it is 
only divisible by similar prime polynomials. The necessary and sufficient 
condition that a completely reducible polynomial be uniform is that the 
basis contain only similar prime polynomials. 

Let F,(x) now be an arbitrary completely reducible polynomial; the union 
of all prime divisors of F,(x) which are similar to a given prime polynomial 
P,(x), we shall call a maximal uniform component of F,(x). It then follows 
that 

Every completely reducible polynomial is uniquely representable as the union 
of maximal uniform components. 

Let finally 


F,(x) = PS’(2)] 


be an arbitrary completely reducible polynomial. If $F_p(x)$ is to be divisible by
any prime polynomial $P_p(x)$ different from the basis elements, then at least
two basis elements must be similar. Any prime divisor of $F_p(x)$ has to be
similar to one of the basis elements, and if $P_p^{(1)}(x) = A P_p(x) A^{-1} \neq P_p(x)$
we could have constructed the basis such that $P_p(x)$ and $P_p^{(1)}(x)$ were basis
elements. When conversely an arbitrary polynomial $F_p(x)$ is divisible both
by $P_p(x)$ and the similar polynomial $P_p^{(1)}(x)$, we see that

$F_p(x) \equiv 0$, $F_p(x) \times A_p(x) \equiv 0 \pmod{P_p(x)}$

and from a theorem in §2, it follows that $F_p(x)$ is also divisible by all polynomials
$B P_p(x) B^{-1}$, where $B_p(x)$ is an arbitrary polynomial of the form

$B_p(x) = k_1 x + k_2 A_p(x)$ $(k_1, k_2 = 0, 1, \dots, p-1)$.

Since the roots of $P_p^{(1)}(x)$ are different from those of $P_p(x)$, it is easily seen
that $B P_p(x) B^{-1}$ is different from $P_p(x)$ and $P_p^{(1)}(x)$ when $k_1 \neq 0$ and $k_2 \neq 0$.
This shows that

The necessary and sufficient condition that a completely reducible polynomial 
be divisible by a prime polynomial different from those occurring in a basis 
representation is that the basis representation contain at least two similar prime 
polynomials. 


* See Ore I, §2, chapter 2. 




One can also state this by saying that the basis representation of a com- 
pletely reducible polynomial is unique, when none of the components are similar. 

5. Decomposable and distributive polynomials. In Theorem 4 and The- 
orem 5 we have found two different representations of p-polynomials; several 
others can be found, but only two other representations of importance will 
be mentioned briefly. 

A polynomial is said to be decomposable when there exists a representation

(16) $F_p(x) = [A_p(x), B_p(x)]$

where $A_p(x)$ is relatively prime to $B_p(x)$; $F_p(x)$ is said to be indecomposable
when no such representation exists. We can prove

THEOREM 6. Every polynomial can be represented as the union of a number
of indecomposable polynomials

(17) $F_p(x) = [A_p^{(1)}(x), \dots, A_p^{(r)}(x)]$,

where each indecomposable polynomial $A_p^{(i)}(x)$ is relatively prime to the union of
the others; when two or more different representations (17) exist, they will all
have the same number of components, which will be similar in pairs.

A polynomial $F_p(x)$ shall be said to be distributive when there exists a
decomposition (16), where $A_p(x)$ and $B_p(x)$ are proper divisors of $F_p(x)$; a
cross-cut $C_p(x)$ of $A_p(x)$ and $B_p(x)$ may perhaps exist; when no such decomposition
(16) exists, we shall say that $F_p(x)$ is non-distributive.

For the proofs of the following theorems it is necessary to assume that 
K is perfect; one can then state 

THEOREM 7. The necessary and sufficient condition that a polynomial $F_p(x)$
be non-distributive is that $F_p(x)$ have only a single left-hand prime divisor $P_p(x)$.

We shall say that the non-distributive polynomial $F_p(x)$ belongs to $P_p(x)$.
It is easily seen that every left-hand divisor of $F_p(x)$ is also non-distributive
and belongs to the same prime polynomial $P_p(x)$. One can also prove

THEOREM 8. Let the completely reducible polynomial

(18) $A_p(x) = [P_p^{(1)}(x), \dots, P_p^{(r)}(x)]$

be the union of all prime polynomials dividing a given polynomial $F_p(x)$ on the
left. Then every representation of $F_p(x)$ as the union of non-distributive
components has the form

(19) $F_p(x) = [C_p^{(1)}(x), \dots, C_p^{(r)}(x)]$,

where the non-distributive polynomial $C_p^{(i)}(x)$ belongs to a prime polynomial
similar to $P_p^{(i)}(x)$ $(i = 1, 2, \dots, r)$.




We have supposed that (19) is a shortest representation; i.e., we have omit- 
ted all components which divide the union of the others. 

6. The invariant ring. We shall now define a certain characteristic group
$G_F$, the invariant group, and also a characteristic ring $R_F$, the invariant ring,
corresponding to an arbitrary p-polynomial $F_p(x)$. We make the following
definition:

The polynomial $I_p(x)$ is said to be an invariant transformer of $F_p(x)$, when
the transform $I_p F_p(x) I_p^{-1}$ is a divisor of $F_p(x)$.

It is easy to determine the invariant transformers in some simple cases.
Let first $F_p(x) = x^p - ax$; it can then be assumed that $I_p(x) = cx$, and from §2
follows

$I_p F_p(x) I_p^{-1} = x^p - a c^{p-1} x, \quad c \neq 0,$

giving the values $c = 0$ and $c^{p-1} = 1$, i.e., $c = 0, 1, \dots, p-1$. Let next
$F_p(x) = x^{p^n}$; using the definition of the transform, one easily finds that every
polynomial is an invariant transformer.

The definition of the invariant transformers can easily be modified in the 
following way: 


THEOREM 9. The necessary and sufficient condition that $I_p(x)$ be an invariant
transformer of $F_p(x)$ is that

(20) $F_p(x) \times I_p(x) \equiv 0 \pmod{F_p(x)}$.

This condition (20) immediately shows that the sum, difference, and
product of two invariant transformers is again an invariant transformer, and
the ring of all invariant transformers is the invariant ring of $F_p(x)$.

When an invariant transformer $I_p(x)$ is relatively prime to $F_p(x)$ we must
have

(21) $I_p F_p(x) I_p^{-1} = F_p(x)$.

The invariant transformers satisfying (21) form the invariant group. It is
obvious that the product of two such polynomials has the same property,
and to show the group property it only remains to show the existence of an
inverse. Since $I_p(x)$ is relatively prime to $F_p(x)$, we can determine an $I_p^{(1)}(x)$
such that

$I_p^{(1)}(x) \times I_p(x) \equiv x \pmod{F_p(x)}$,

and it is easily seen that also $I_p^{(1)}(x)$ satisfies (21).

Let now $\alpha$ be a root of

(22) $F_p(x) = 0$;

from (20) follows that $I_p(\alpha)$ is also a root of (22) for an arbitrary root $\alpha$ and
an arbitrary invariant transformer $I_p(x)$. The invariant transformer therefore
permutes the roots of (22), or, expressed in a different way, it transforms
the modulus formed by the roots of (22) into itself or a submodulus. When all
the roots of (22) are different, the invariant transformer $I_p(x)$ is uniquely
determined by the transformation it produces, since $I_p(\alpha) = I_p^{(1)}(\alpha)$ for all $\alpha$
implies $I_p(x) \equiv I_p^{(1)}(x) \pmod{F_p(x)}$. Since the number of roots of (22) is
finite we obtain


THEOREM 10. When all the roots of $F_p(x) = 0$ are different, the invariant ring
and the invariant group are finite.

When $F_p(x) = 0$ has equal roots, then

$F_p(x) = x^{p^s} \times G_p(x)$,

and the invariant ring of $F_p(x)$ will be identical with the invariant ring of
$G_p(x)$, when considered (mod $G_p(x)$). Incidentally, these remarks also show
that the polynomials $c x^{p^n}$ are the only ones for which all the polynomials are
invariant transformers.
From the fact that the invariant ring is finite follows that it is an algebra
over the finite field (mod p) and the invariant ring has a basis, such that every
element can be represented in the form

$I_p(x) \equiv c_1 I_p^{(1)}(x) + \cdots + c_m I_p^{(m)}(x) \pmod{F_p(x)}$

where $c_i = 0, 1, \dots, p-1$. The invariant ring defined here should more
specifically be called the right-hand invariant ring. There also exists a
left-hand invariant ring having similar properties; for a left-hand invariant
transformer $J_p(x)$ one must have as in (20)

(23) $J_p(x) \times F_p(x) = F_p(x) \times I_p(x)$,

and here $I_p(x)$ must be a right-hand invariant transformer according to
definition. When conversely $I_p(x)$ is an invariant right-hand transformer it is
easily seen that

(24) $J_p(x) = F_p(x) \times I_p(x) \times F_p(x)^{-1}$

is a left-hand invariant transformer of $F_p(x)$.

THEOREM 11. The left-hand and right-hand invariant rings and groups are
directly isomorphic through the correspondence (24).


Let us finally determine the invariant ring of a prime polynomial $P_p(x)$.
In this case every $I_p(x) \not\equiv 0 \pmod{P_p(x)}$ has an inverse, and the invariant
ring is a field. Since this field has a finite number of elements, it follows from
a theorem of Wedderburn that it is commutative.

THEOREM 12. The invariant ring of a prime polynomial $P_p(x)$ is a commutative,
finite field.


The invariant ring of a p-polynomial is closely connected with the struc- 
ture and representations of the given polynomial and several interesting re- 
sults can be obtained. It will however carry us too far to study these problems 
here. 


CHAPTER 3. CONNECTION BETWEEN p-POLYNOMIALS 
AND ORDINARY POLYNOMIALS 


1. Polynomials belonging to a p-polynomial. We shall finally study some
of the connections between p-polynomials and ordinary polynomials in K.
First of all we shall show that an arbitrary polynomial $f(x)$ of nth degree
always divides a p-polynomial. Let us divide all $p^i$th powers of x by $f(x)$;
this gives relations of the form

(1) $x^{p^i} \equiv a_0^{(i)} + a_1^{(i)} x + \cdots + a_{n-1}^{(i)} x^{n-1} \pmod{f(x)} \quad (i = 0, 1, 2, \dots)$.

The powers $1, x, x^2, \dots$ on the right-hand side of the $\nu \le n$ first congruences
(1) can now be eliminated, and on the left-hand side this gives a p-polynomial
$F_p(x)$ with the exponent $\nu$ which is divisible by $f(x)$. Since $F_p(x)$ obviously is
the p-polynomial with the smallest exponent having this property, it follows
from Theorem 2, chapter 1, that every other p-polynomial $\varphi_p(x)$ having the
same property must be symbolically divisible by $F_p(x)$.

THEOREM 1. Every polynomial $f(x)$ of degree n belongs to a unique, reduced
p-polynomial $F_p(x)$ with exponent $\nu \le n$, such that $f(x)$ divides $F_p(x)$ and every
other p-polynomial $\varphi_p(x)$ divisible by $f(x)$ is symbolically divisible by $F_p(x)$.
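The elimination described above can be sketched in a few lines, under the simplifying assumption that K is the prime field GF(p) and that f is monic. The function names (`poly_mod`, `minimal_p_polynomial`, and so on) are hypothetical and not part of the paper; polynomials are stored as coefficient lists, lowest degree first. The routine reduces $x^{p^i}$ modulo f for i = 0, 1, ... and stops at the first linear dependence over GF(p), which gives the coefficients of $x, x^p, \dots, x^{p^\nu}$ in the p-polynomial to which f belongs.

```python
def poly_mod(a, f, p):
    """Remainder of a modulo the monic polynomial f over GF(p)."""
    a = [c % p for c in a]
    n = len(f) - 1
    for i in range(len(a) - 1, n - 1, -1):
        c = a[i]
        if c:
            for j in range(n + 1):
                a[i - n + j] = (a[i - n + j] - c * f[j]) % p
    a = a[:n]
    return a + [0] * (n - len(a))


def poly_mulmod(a, b, f, p):
    """Product a*b reduced modulo f over GF(p)."""
    prod = [0] * (len(a) + len(b) - 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            prod[i + j] = (prod[i + j] + x * y) % p
    return poly_mod(prod, f, p)


def poly_powmod(a, e, f, p):
    """a**e reduced modulo f over GF(p), by repeated squaring."""
    result = poly_mod([1], f, p)
    base = poly_mod(a, f, p)
    while e:
        if e & 1:
            result = poly_mulmod(result, base, f, p)
        base = poly_mulmod(base, base, f, p)
        e >>= 1
    return result


def minimal_p_polynomial(f, p):
    """Coefficients [c_0, ..., c_nu] (c_nu = 1) of the p-polynomial
    sum c_i * x**(p**i) of smallest exponent divisible by f.
    Assumes K = GF(p) and f monic of degree >= 1 (Python 3.8+)."""
    n = len(f) - 1
    basis = []                      # (pivot index, reduced vector, combination)
    r = poly_mod([0, 1], f, p)      # x**(p**0) mod f
    for i in range(n + 1):
        vec = r[:]
        combo = [0] * (n + 1)
        combo[i] = 1
        for piv, bvec, bcombo in basis:
            c = vec[piv]
            if c:
                vec = [(v - c * b) % p for v, b in zip(vec, bvec)]
                combo = [(v - c * b) % p for v, b in zip(combo, bcombo)]
        if not any(vec):            # first dependence found: exponent nu = i
            return combo[: i + 1]
        piv = next(k for k, v in enumerate(vec) if v)
        inv = pow(vec[piv], -1, p)
        basis.append((piv,
                      [(v * inv) % p for v in vec],
                      [(v * inv) % p for v in combo]))
        r = poly_powmod(r, p, f, p)  # next power x**(p**(i+1)) mod f
    raise AssertionError("a dependence must occur among n+1 residues")
```

For example, over GF(3) the call `minimal_p_polynomial([1, 0, 1], 3)` treats $f(x) = x^2 + 1$ and returns `[1, 1]`, that is $x^3 + x = x(x^2 + 1)$, with exponent 1.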

The number $\nu$ shall be called the exponent of $f(x)$. It is easily seen that one
can determine $F_p(x)$, when the p-polynomials corresponding to the irreducible
factors of $f(x)$ are known. Let namely

(2) $f(x) = \varphi_1(x)^{e_1} \cdots \varphi_r(x)^{e_r}$

be the prime-function decomposition of $f(x)$; we denote by $g(x)$ the product
of all different prime factors of $f(x)$:

(3) $g(x) = \varphi_1(x) \cdots \varphi_r(x)$.

When $g(x)$ belongs to $G_p(x)$, then $F_p(x)$ must be symbolically divisible by
$G_p(x)$, and since $F_p(x)$ cannot contain equal factors, except when the last
coefficient vanishes, it follows that $F_p(x)$ has the form

$F_p(x) = x^{p^l} \times G_p(x)$,

where l is the smallest exponent such that $p^l$ exceeds all $e_i$ in (2).

One can consequently assume that the polynomial to be considered has
no equal factors and therefore is of the form (3). One finds, that when the
irreducible factor $\varphi_i(x)$ belongs to $\varphi_p^{(i)}(x)$, then $g(x)$ belongs to the union

$G_p(x) = [\varphi_p^{(1)}(x), \dots, \varphi_p^{(r)}(x)]$.


2. The degrees of the factors. When the roots of the polynomial $f(x)$ are
known, the corresponding p-polynomial $F_p(x)$ can be determined in a
different way. Let

(4) $f(x) = (x - \theta_1) \cdots (x - \theta_n)$,

and let us assume that all roots are different and non-vanishing. In the field
$K(\theta_1, \dots, \theta_n)$ a linear factor $x - \theta_i$ belongs to $x^p - \theta_i^{p-1} x$, and from the last
remarks of §1 we obtain

THEOREM 2. Let the n different non-vanishing numbers

(5) $\theta_1, \theta_2, \dots, \theta_n$

be the roots of a polynomial $f(x)$ in K; then $f(x)$ belongs to

(6) $F_p(x) = [x^p - \theta_1^{p-1} x, \dots, x^p - \theta_n^{p-1} x]$.

It is obvious that the coefficients of $F_p(x)$ belong to K, since they are
symmetric functions of the elements (5).

It should be noted that there are always polynomials belonging to an
arbitrary p-polynomial $F_p(x)$, for instance $F_p(x)$. There are however not
always irreducible polynomials belonging to a given p-polynomial, and
consequently there exist p-polynomials without primitive roots, i.e., such
that every root of $F_p(x) = 0$ satisfies a p-equation with lower exponent. As an
example let us take

$F_p(x) = [x^p - ax, \; x^p - a b^{p-1} x]$.

$F_p(x)$ is the union of two similar p-polynomials with the exponent 1, and its
roots are of the form

$\theta = k_1 a^{1/(p-1)} + k_2 b\, a^{1/(p-1)} \quad (k_1, k_2 = 0, 1, \dots, p-1)$,

and $\theta$ satisfies the equation with exponent 1

$x^p - (k_1 + k_2 b)^{p-1} a x = 0$.




It would be an interesting problem to determine the necessary and sufficient
condition for the existence of primitive roots.

Let us now suppose that the p-polynomial $F_p(x)$ is generated by an ordinary
polynomial $f(x)$ with the roots (5) as indicated in Theorem 2. The roots
of $F_p(x)$ are then according to (6)

(7) $M_p = k_1\theta_1 + \cdots + k_n\theta_n \quad (k_i = 0, 1, \dots, p-1;\ i = 1, \dots, n)$.

All factors $g(x)$ of $F_p(x)$ have therefore roots lying in the Galois field
$K(\theta_1, \dots, \theta_n)$ and if N is the degree of this Galois field, it follows that the
degree of each factor is a divisor of N. This gives in particular

THEOREM 3. When $F_p(x)$ is generated by an irreducible Galois polynomial
$f(x)$ of degree N, then all factors of $F_p(x)$ have degrees equal to N or a factor of N.

It is possible that even for an arbitrary p-polynomial $F_p(x)$ the theorem
holds that if N is the degree of the maximal factor of $F_p(x)$, then all other
factors have degrees equal to N or a factor of N. I have only been able to
prove this theorem under certain limiting conditions. It should be observed
that Theorem 2 gives a generalization of a well known property of the
polynomial $x^{p^n} - x$ (mod p).

3. The Galois group. Let $F_p(x)$ be a p-polynomial and $f(x)$ a polynomial
belonging to $F_p(x)$; when the roots of $f(x)$ are given by (5), then the roots of
$F_p(x)$ form the modulus (7). The following is therefore obvious:

THEOREM 4. The exponent of $f(x)$ is equal to the rank of the modulus (7).

Choosing the notation in a suitable manner, one can write the modulus
(7) in the reduced form

(8) $M_p = k_1\theta_1 + \cdots + k_\nu\theta_\nu \quad (k_i = 0, 1, \dots, p-1)$.

The equations $F_p(x) = 0$ and $f(x) = 0$ define the same Galois field, as one sees
from the representation (8) of the roots. Let G be the Galois group of $f(x)$;
any permutation S in G will then produce a substitution on the linear
expressions (8), and it is easily seen that two different permutations will produce
different substitutions. This shows

THEOREM 5. When $\nu$ is the exponent of the polynomial $f(x)$, then there exists
a true representation of the Galois group G of $f(x)$ by means of matrices of rank $\nu$
in the finite field (mod p).

We have in the introduction mentioned the analogy between p-polynomials
and differential polynomials. To those who are familiar with the
Picard-Vessiot theory of linear homogeneous differential equations, it will be
clear that the group of linear substitutions on the expressions (8) corresponding
to the Galois group G is the analogue of the group of rationality of a
differential equation. One may of course obtain a different representation of
G by using a different basis for the roots of $F_p(x)$, but it is easily seen that all
such representations are similar.

Almost all theorems on the group of rationality have analogues in the 
theory of p-polynomials. I shall here only mention two results, analogous to 
theorems by Loewy on differential equations: 


THEOREM 6. The necessary and sufficient condition that a p-polynomial be
reducible in K is that the representation of G be reducible.

When the representation of G is reducible, one can choose a basis for the
modulus of the roots, such that there exists a submodulus G' which is
transformed into itself by all substitutions of G. The submodulus G' defines a factor
$G_p(x)$ of $F_p(x)$ and since $G_p(x)$ is left unchanged by all substitutions in G
it has coefficients in K. When conversely $F_p(x)$ has a symbolic factor $Q_p(x)$
it is clear that a reducible representation of G exists. In a similar way we show

THEOREM 7. When $F_p(x)$ is decomposable,

$F_p(x) = [A_p(x), B_p(x)]$,

then the representation of G is also decomposable and equal to the sum of two
representations corresponding to $A_p(x)$ and $B_p(x)$, and conversely.


YALE UNIVERSITY, 
New Haven, Conn. 


THE DEGREE AND CLASS OF MULTIPLY 
TRANSITIVE GROUPS, III* 


BY 
W. A. MANNING 


If a group of substitutions of class u (>3) is more than triply transitive,
its degree does not exceed $2u+1$. This is Bochert's Theorem† (reduced one
unit) and was the most that could be said in an entirely general way up to the
present about the degree of highly transitive groups of given class. It will be
proved in this paper that a t-ply transitive group of class u (>3) is of degree
$n < 6u/5 + u/t - t$ if $t > 23$. It will also be shown that if $t > 4$, $n \le 2u$; if $t > 5$,
$n \le 5u/3$; if $t > 7$, $n < 3u/2$; if $t > 11$, $n < 4u/3$; if $t > 21$, $n < 5u/4$.

1. On page 648 of DC2 it is proved that for 4-ply transitive groups 
n <2u+1. The method there used is now extended to 5-ply transitive groups 
in the proof of the following theorem. 


THEOREM I. If n is the degree and u the class of a 5-ply transitive group, not
alternating or symmetric, $n \le 2u$.

If there is a substitution of order 2 and degree u in the group G, $n < 5u/3
< 2u-1$ ($u \ge 6$; DC1, p. 463). Then unless all the substitutions of degree u
are of order 3, at least one of them is of order >4. Let us say that

$S = (abcde \cdots) \cdots$,

and let $S_1, S_2, \dots, S_w$ be similar to S and a complete set of conjugates under
H, the subgroup of G that fixes a, c, and f, where f is a letter of S not adjacent
to a or c in S. Since G is triply transitive we can make


Si = 


The condition on the letter f can be satisfied if S is of order >5 by a letter of 
the first cycle of S, while if S is of order 5 there will be in S a second cycle 
from which it can be taken, because u>5. 

We now have, quite as on page 648 of DC2, 


* Presented to the Society, August 30, 1932, and June 19, 1933; received by the editors October 
3, 1932, and (revised and extended) February 10, 1933. 

† Manning, these Transactions, vol. 31 (1929), p. 648. This paper will be referred to as DC2,
and the first paper bearing the same title and which appeared in these Transactions, vol. 18 (1917),
p. 463, will be called DC1.




3w(u—1)(u— 3) 2w(u—3) — 2)(u — 3)(u — 4) 
n— 3 7 n— 3 7 (n — 3)(n — 4) 
2w(u—1) — — 2)(u — 6) 


(1) 


Writing x for $n-4$ and k for $u-3$, this is

(2) $kx^2 - (3k^2 + k - 4)x + 2k^3 - 8k - 6 \le 0$.

If $x = 2k+1$, (2) becomes $-k^2-2$. If $x = 2k+2$, it becomes $2k+2$. Therefore,
unless all the substitutions of degree u of G are of order 3, $n < 2u$.

If all the substitutions of degree u are of order 3, $S = (abc)(def) \cdots$ and
$S_1 = (b)(d)(a \cdots) \cdots$. As on page 649 of DC2 we set up
3w(u 3) 2w(u— 3) — 2)(u — 3)(u — 4) 
n — 3 n— 3 (n — 3)(n — 4) 
wiu—1) — 1)(u — 2)(u — 5) 2w(n — 
ome = wu 
n— 3 (nm — 3)(m — 4) (n — 3)(m — 4) 


3w+ 
(3) 


With the same x and k as before, this is

(4) $kx^2 - (3k^2 + 2k - 4)x + 2k^3 + k^2 - 7k - 2 \le 0$.

If $x = 2k+3$, the left member is $4k+10 > 0$, while if $x = 2k+2$, it is $-k^2+k+6$,
so that $x < 2k+3$, or finally $n \le 2u$, and our theorem is proved.
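The two substitutions quoted for each case can be checked mechanically. The sketch below assumes that (2) reads $kx^2 - (3k^2+k-4)x + 2k^3 - 8k - 6 \le 0$ and that (4) reads $kx^2 - (3k^2+2k-4)x + 2k^3 + k^2 - 7k - 2 \le 0$; these are the forms consistent with the four values stated in the proof, and the function names `left_2` and `left_4` are hypothetical.

```python
def left_2(x, k):
    """Left member of (2), in the assumed form described above."""
    return k * x * x - (3 * k * k + k - 4) * x + 2 * k**3 - 8 * k - 6


def left_4(x, k):
    """Left member of (4), in the assumed form described above."""
    return k * x * x - (3 * k * k + 2 * k - 4) * x + 2 * k**3 + k * k - 7 * k - 2


def check_theorem_I_values(k_max=200):
    """Compare the left members with the values quoted in the text."""
    for k in range(1, k_max):
        assert left_2(2 * k + 1, k) == -k * k - 2      # x = 2k+1 in (2)
        assert left_2(2 * k + 2, k) == 2 * k + 2       # x = 2k+2 in (2)
        assert left_4(2 * k + 2, k) == -k * k + k + 6  # x = 2k+2 in (4)
        assert left_4(2 * k + 3, k) == 4 * k + 10      # x = 2k+3 in (4)
    return True
```

Since $x = n-4$ and $k = u-3$, the value $x = 2k+2$ corresponds to $n = 2u$, which (2) excludes and (4) permits.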


2. We take up next 


THEOREM II. If n is the degree and u (>3) the class of a 6-ply transitive
group, $n \le 5u/3$.


In the proof of I the well known fact that there is no 5-ply transitive group 
of class >3 and <8 was used. That there is no 6-ply transitive group of 
degree <53 is an immediate consequence of the following two theorems: 

A. If a primitive group of class >3 contains a circular substitution of prime
order p (>3), $n \le p+2$.*

B. Let q be an integer $\ge 2$ and <5; p any prime >q+1; then the degree of
a primitive group which contains a substitution of order p that displaces pq
letters (not including the alternating group) is $\le pq+q$.†

For example, were there such a group of degree 25, with its order neces- 
sarily a multiple of 25-24-23-22-21-20, it would contain a substitution of 
order 11 of degree 11 or 22. The first is impossible by A and the second by B. 
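The way A and B combine in this example can be sketched as a small search. The reading assumed here is: a t-ply transitive group of degree n has order divisible by every prime dividing $n(n-1)\cdots(n-t+1)$, so it contains a substitution of order p displacing pq letters for some q with $pq \le n$; the prime p rules the degree out when every such q is forbidden, by A (q = 1, $n > p+2$) or by B ($2 \le q < 5$, $p > q+1$, $n > pq+q$). The function names are hypothetical, and the bounds in A and B are taken as non-strict.

```python
def is_prime(m):
    """Trial-division primality test, adequate for small m."""
    if m < 2:
        return False
    d = 2
    while d * d <= m:
        if m % d == 0:
            return False
        d += 1
    return True


def excluding_primes(n, t=6):
    """Primes p > 3 that, under the assumed reading of A and B,
    rule out a non-alternating t-ply transitive group of degree n
    and class > 3."""
    found = []
    for p in range(5, n + 1):
        if not is_prime(p):
            continue
        # p must divide n(n-1)...(n-t+1)
        if not any((n - i) % p == 0 for i in range(t)):
            continue
        all_forbidden = True
        for q in range(1, n // p + 1):
            if q == 1:
                forbidden = n > p + 2                              # Theorem A
            else:
                forbidden = q < 5 and p > q + 1 and n > p * q + q  # Theorem B
            if not forbidden:
                all_forbidden = False
                break
        if all_forbidden:
            found.append(p)
    return found
```

For degree 25 the search returns the prime 11 used in the text (degree 11 is excluded by A and degree 22 by B), while 23 is not returned because $25 \le 23 + 2$.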

By I, the 6-ply transitive group G, if of class $\le 26$, is of degree $\le 52$, an

* C. Jordan, Bulletin de la Société Mathématique de France, vol. 1 (1873), p. 40. 

† Manning, these Transactions, vol. 15 (1909), p. 247.




impossibility by A and B, so that only groups of class >26 need be con- 
sidered. 

The theorem is known to be true if one of the substitutions of degree u is 
of even order (DC1). 

According to Dr. Luther the 6-ply transitive groups of class u (>3) in
which there are substitutions of degree $u+e$ of even order ($0 \le e < u/9$) have

(5) $n < \frac{4u}{3} + 4e$.*

However, it may be that all the substitutions of degree $\le u+e$ are of odd
order. Suppose that to be the case.

Let S be a substitution of G of degree u, and of order >3. Then its order,
being an odd number, is $\ge 5$:

$S = (abcde \cdots) \cdots (\alpha) \cdots$.

Among the conjugates of S under G there is a substitution
= 

and the complete set of conjugates under H (the subgroup of G that fixes the
four letters a, b, c, and $\alpha$) is $S_1, S_2, \dots, S_w$. Reasoning as in the proof of I,


n—4 
wiu— 3) — 3)\(u— 4)? 


(n — — 5) 


(7) = = wt 


Now 
= (ca)---, 


a substitution of even order, so that 

3w(u — 3)? . 2w(u— 3) 22w(u — 3)(u — 4)? 
n— 4 n—A4 (n — — 5) 

= w(u+te+i1). 


6w + 


(8) 


Here we replace $n-5$ by x and $u-3$ by k, and have

(9) $(e + k)x^2 + (e - 3k^2 + 3k)x + 2k(k - 1)^2 \le 0$.

If $2k-4e-2$ and $2k-4e-1$ are put for x in (9) we obtain $-2ke + 16e^3 + 12e^2
+ 2e$ and $(k-3e)^2 + 16e^3 - 5e^2$, respectively. While the first of these numbers


* C. F. Luther, American Journal of Mathematics, vol. 55 (1933), p. 77. 




may be negative for large values of k, the second is always positive.
Therefore $x < 2k-4e-1$, that is,

(10) $n \le 2u - 4e - 3$.

If S is of order 3,

$S = (abc)(def) \cdots (\alpha) \cdots$,


$S_1 = (ab\alpha) \cdots$,
and 
w(u — 3)? 


(11) = 2w + 


as before. But 


w(u — 3)*(u — 4) 
(12) La = (n — 4)(n — 5) . 


The inequality 

3w(u — 3)? 2w(u — 3)2(u — 4) 

with x for $n-5$ and k for $u-3$, reduces to

(14) $(e + k)x^2 + (e - 3k^2 + k)x + 2k^3 - 2k^2 \le 0$.


= w(u +e+ 1), 


(13) 6w+ 


If $2k-4e$ and $2k-4e+1$ are put for x in (14) we get $-2ke + 16e^3 - 4e^2$ and
$(k-3e+1)^2 + 16e^3 - 21e^2 + 8e - 1$, respectively. Hence $x < 2k-4e+1$, or
$n < 2u-4e$, or

(15) $n \le 2u - 4e - 1$.
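The four values obtained from (9) and (14) are polynomial identities in k and e, so they can be confirmed by direct evaluation on a grid of integers. The following is a minimal check; the function names `left_9` and `left_14` are hypothetical, and the polynomials are entered as they appear in (9) and (14).

```python
def left_9(x, k, e):
    """Left member of (9)."""
    return (e + k) * x * x + (e - 3 * k * k + 3 * k) * x + 2 * k * (k - 1) ** 2


def left_14(x, k, e):
    """Left member of (14)."""
    return (e + k) * x * x + (e - 3 * k * k + k) * x + 2 * k**3 - 2 * k * k


def check_theorem_II_values(k_max=40, e_max=12):
    """Compare the left members with the four values quoted in the text."""
    for k in range(k_max):
        for e in range(e_max):
            y = 2 * k - 4 * e
            assert left_9(y - 2, k, e) == -2 * k * e + 16 * e**3 + 12 * e * e + 2 * e
            assert left_9(y - 1, k, e) == (k - 3 * e) ** 2 + 16 * e**3 - 5 * e * e
            assert left_14(y, k, e) == -2 * k * e + 16 * e**3 - 4 * e * e
            assert left_14(y + 1, k, e) == (
                (k - 3 * e + 1) ** 2 + 16 * e**3 - 21 * e * e + 8 * e - 1
            )
    return True
```

With $x = n-5$ and $k = u-3$, the bounds $x \le 2k-4e-2$ and $x \le 2k-4e$ translate into (10) and (15) respectively.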

Dr. Luther's limit (which for e=0 is that of DC1), $4u/3+4e$, increases
with e, while (15) decreases with e. It is to be proved from (5) and (15) that
$n \le 5u/3$, irrespective of the presence or absence of substitutions of order 2.
Let E be the integral part of $u/12-1/8$, which is the solution for e of the
equation

(16) $\frac{4u}{3} + 4e = 2u - 4e - 1$.

Now either all the substitutions of degree $\le u+E+1$ of G are of odd order,
or one of the substitutions of degree $\le u+E+1$ is of order 2. Then if we put
$E+1$ for e in (5), we have a valid upper limit for the degree of 6-ply transitive
groups of class u (>3). Therefore


(17) $n < \frac{4u}{3} + 4E + 4 \le \frac{4u}{3} + 4\left(\frac{u}{12} - \frac{1}{8}\right) + 4 = \frac{5u}{3} + \frac{7}{2}$.

This is not what we set out to prove, but before proceeding farther, let us
make use of it to revise the lower limit, $u > 26$, of the class of 6-ply transitive
groups under which we have been working. From $n > 52$, $5u/3 + 7/2 > 52$,
and therefore $u \ge 30$.

To find the limit stated in our theorem, we return to the two cases:
(1) At least one substitution of degree u is of order >3. (2) All substitutions
of degree u are of order 3.

(1) In this case $n \le 2u-4e-3$, and the solution of

(18) $\frac{4u}{3} + 4e = 2u - 4e - 3$

is $e = u/12 - 3/8$. This number is not an integer. Let E be its integral part.
Since $E+1 < u/9$ for $u \ge 30$, formula (5) holds good for E and for E+1. When
there are substitutions of order 2 of degree $\le u+E$ in G, $n < 4u/3+4E$, but
if all the substitutions of degree $\le u+E$ are of odd order, $n \le 2u-4E-3$,
and of the two, the latter gives the higher limit and should be retained. But
we saw that $4u/3+4E+4$ was a true limit also. Then of these two formulas,
$2u-4E-3$ and $4u/3+4E+4$, we are at liberty to choose the lower.

If $u \equiv r$, mod 12 ($r = 5, 6, \dots, 16$), $E = (u-r)/12$, and the two formulas
between which we may choose are $5u/3 + r/3 - 3$ and $5u/3 + 4 - r/3$. One or
the other gives $n \le 5u/3$ unless r = 10 or 11. If r = 11, $n < 5u/3+1/3$ is
equivalent to $n \le 5u/3-1/3$. Now Dr. Luther's limit, so concisely stated, is deduced
from the inequality


(19) $n < 5 + \frac{4(u + e - 4)^2}{3\left(u - \frac{2e}{3} - 4\right)}$.

When r = 10, and therefore $E = (u-10)/12$, we substitute $E+1 = u/12+1/6$
for e in (19) and have on reduction

(20) $n < \frac{5u}{3} - \frac{u}{102} + 0.49 + \frac{18.68}{17u - 74}$.

If u=58, n<5u/3—0.04; if u=46, n<5u/3+0.07; if u=34, n<5u/3+0.21. 




Since $u \ge 30$, and since our formula (10), for $e = E+1 = u/12+1/6$, becomes
$5u/3 - 11/3$, we conclude that $n \le 5u/3$ when r = 10.

(2) Every substitution of degree u is of order 3, and $u \equiv 0$, mod 3. Equating
the right hand members of the two formulas

(15) $n \le 2u - 4e - 1$

and

(21) $n \le \frac{4u}{3} + 4e - 1$,

we find $e = u/12$. If $u/12$ is a whole number, (15) becomes $n \le 5u/3-1$.
Let $u \equiv r$, mod 12 (r = 3, 6, or 9); $E = (u-r)/12$. The condition $E+1 < u/9$
is satisfied. We are at liberty to choose between $n \le 5u/3 + r/3 - 1$ and
$n \le 5u/3 + 3 - r/3$. For r = 3 and for r = 9, $n \le 5u/3$. For r = 6 both formulas
are $n \le 5u/3+1$. Again we have recourse to Dr. Luther's original formula (19)
and in it put $e = E+1 = u/12+1/2$. It reduces to


5u u 51.89 


22 


When u = 102, $n < 5u/3+0.94$, and because the sum of the last three terms
of (22) decreases as u increases, $n \le 5u/3$ for $u \ge 102$. For u = 90, 78, 66, 54,
and 42, $5u/3+1$ = 151, 131, 111, 91, and 71, respectively. But with the aid
of Theorems A and B it is easy to show that there are no 6-ply transitive
groups of these degrees and of class >3. Therefore $n \le 5u/3$, for all
non-alternating 6-ply transitive groups.

3. We now undertake to prove the following fundamental theorem: 

THEOREM III. If n is the degree and u (>3) is the class of a t-ply transitive
group (t>6) in which all the substitutions of degree $\le u+e$ are of odd order,
$n < 2u - 4e - 5t + 37$.

Here $e = 0, 1, \dots$. There is a substitution S of order >3 among the
substitutions of degree u of G. Because G is of class u, S is a regular substitution:

S = (ab+++) +++ 


and a, b,- - - , 7 are the first ‘—4 letters of S. Since G is of sufficiently high 
transitivity, it contains a substitution (ka) - - - (a)(b) - - - (7) - - - which 
transforms S into 


Sy = (ab---) +++ ++ 




The indicated order of the letters of S and S; is to be maintained pales 

It may be that ¢ and the order of S are so related numerically that S,;= 

about the order of S-1S¢ 1§S,; but if S,; has two or more of the letters a, }, 
- ,j preceding a in its cycle of S,, 


= (ka) - 


and S-!S;1SS; fixes all the ¢—4 letters a, b, - - - , 7 except perhaps the first 
two of the cycle of S; in which @ occurs. 

Let us assume for the moment that t is such a number that $S_1 = \cdots$.
There is a doubly transitive subgroup H of G that fixes the $t-2$ letters a,
b, ..., j, k, and $\alpha$. Its degree is $n-t+2$. Under H, $S_1$ is one of a complete set
of w conjugate substitutions, $S_1, S_2, \dots, S_w$. Let $S_i$ have $m_i$ letters of H in
common with S. Of these $m_i$ common letters (of H), let S replace $q_i$ by common
letters (of H) and let $S_i$ replace $r_i$ by common letters (of H). Then the
degree of $S^{-1}S_i^{-1}SS_i$ does not exceed $4 + 3m_i - q_i - r_i$. For this substitution
displaces at most 4 of the $t-2$ letters fixed by H: k, $\alpha$, and the first two letters
of the cycle of $S_i$ in which $\alpha$ occurs. It may displace all the other common
letters, $m_i$ in number. Of the letters of $S_i$ that are new to S and are letters of
H, only those $m_i - r_i$ that follow common letters are displaced by $S_i^{-1}SS_i$ and
therefore by $S^{-1}S_i^{-1}SS_i$. And a like statement holds for $S^{-1}S_i^{-1}S$.

For use in the succeeding paragraphs we note that S displaces $u-t+3$
letters of H, and has $(u-t+3)(u-t+2)$ ordered pairs of letters of H; S has
$u-t+3$ sequences in letters both of which are displaced by H if k ends a
cycle of S, but only $u-t+2$ if k does not end its cycle.

Now $S_i$ displaces $u-t+3$ letters of H. The complete set $S_1, S_2, \dots, S_w$
displaces $w(u-t+3)$ letters of H, one as often as any other because H is
transitive. Therefore

(23) $\sum m_i = \frac{w(u - t + 3)^2}{n - t + 2}$.


In $S_1$ there are $u-t+3$ sequences of letters of H if $S_1 = (ef \cdots j\alpha) \cdots$.
But if $\alpha$ is not the last letter of its cycle, there are $u-t+2$ such sequences.
Then the total number of these sequences in the set is $w(u-t+3)$ or
$w(u-t+2)$ and each, because H is doubly transitive, occurs
$w(u-t+3)/[(n-t+2)(n-t+1)]$ times, or $w(u-t+2)/[(n-t+2)(n-t+1)]$ times,
respectively. Hence

(24) $\sum q_i = \frac{w(u - t + 3)^2 (u - t + 2)}{(n - t + 2)(n - t + 1)}$

or

(25) $\sum q_i = \frac{w(u - t + 2)^2 (u - t + 3)}{(n - t + 2)(n - t + 1)}$,

the first, and larger, value of $\sum q_i$ holding only when $S_1 = \cdots (ef \cdots ij\alpha) \cdots$.
There are $(u-t+3)(u-t+2)$ ordered pairs of letters of H in $S_i$ in
both cases. Then

(26) $\sum r_i = \frac{w(u - t + 3)^2 (u - t + 2)}{(n - t + 2)(n - t + 1)}$

or

(27) $\sum r_i = \frac{w(u - t + 3)(u - t + 2)^2}{(n - t + 2)(n - t + 1)}$

in the two cases respectively.

The substitution $S^{-1}S_1^{-1}SS_1$ is of even order and therefore is by hypothesis
of degree $> u+e$. Then

(28) $4w + \sum (3m_i - q_i - r_i) \ge w(u + e + 1)$,

or, if to $\sum q_i$ and $\sum r_i$ are given their smaller values,

(29) $4 + \frac{3(u - t + 3)^2}{n - t + 2} - \frac{2(u - t + 3)(u - t + 2)^2}{(n - t + 2)(n - t + 1)} \ge u + e + 1$.

If $S_1 = \cdots (j\alpha \cdots) \cdots (k) \cdots$, $(j\alpha \cdots)$ is not the first cycle of $S_1$,
while there certainly is in G a substitution $T_1$, a transform of S by
$(a)(b) \cdots (h)(i\alpha)(j\beta)(k\gamma) \cdots$, where $\beta$ and $\gamma$ are two letters fixed by S:

$T_1 = \cdots (ef \cdots gh\alpha) \cdots$.
The substitution 


= (ai)(ef)---. 


$T_1$ is one of w (probably numerically different from the former w) conjugates
under H. There are $u-t+5$ letters of H in $T_i$; in the set $T_1, T_2, \dots, T_w$
each occurs $w(u-t+5)/(n-t+2)$ times. Then

(30) $\sum m_i = \frac{w(u - t + 5)(u - t + 3)}{n - t + 2}$.

In $T_i$ there are $u-t+5$ sequences with both letters displaced by H. Therefore

(31) $\sum q_i = \frac{w(u - t + 5)(u - t + 3)(u - t + 2)}{(n - t + 2)(n - t + 1)}$.


Also

(32) $\sum r_i = \frac{w(u - t + 5)(u - t + 4)(u - t + 2)}{(n - t + 2)(n - t + 1)}$.

Combining,

(33) $4 + \frac{3(u - t + 5)(u - t + 3)}{n - t + 2} - \frac{(u - t + 5)(u - t + 3)(u - t + 2)}{(n - t + 2)(n - t + 1)} - \frac{(u - t + 5)(u - t + 4)(u - t + 2)}{(n - t + 2)(n - t + 1)} \ge u + e + 1$.

In particular, this inequality (33) arises when t = 8 and S is of order 3.

There remains the case in which $S_1 = \cdots (\alpha \cdots) \cdots$, and in which
we actually use the transform of S by $(a)(b) \cdots (i)(j\alpha)(k\beta) \cdots$:

$T_1 = \cdots (\cdots hi\alpha) \cdots$.

We say that $T_1$ is one of a complete set of w conjugates under H. $T_i$ displaces
$u-t+4$ letters of H, and therefore

(34) $\sum m_i = \frac{w(u - t + 4)(u - t + 3)}{n - t + 2}$.

In $T_i$ there are $u-t+4$ sequences in letters of H. Therefore

(35) $\sum q_i = \sum r_i = \frac{w(u - t + 4)(u - t + 3)(u - t + 2)}{(n - t + 2)(n - t + 1)}$.

Hence

(36) $4 + \frac{3(u - t + 4)(u - t + 3)}{n - t + 2} - \frac{2(u - t + 4)(u - t + 3)(u - t + 2)}{(n - t + 2)(n - t + 1)} \ge u + e + 1$.
In the three inequalities (29), (33), and (36), we put $x = n-t+1$, $k = u-3$,
and $s = t-7$, and simplify. They become, in order,

(37) $(e + k)x^2 + [e - 3k^2 + (6s + 7)k - 3s^2 - 6s - 3]x + 2k^3 - (6s + 10)k^2 + (6s^2 + 20s + 16)k - 2s^3 - 10s^2 - 16s - 8 \le 0$,

(38) $(e + k)x^2 + [e - 3k^2 + (6s + 1)k - 3s^2 + 3]x + 2k^3 - (6s + 3)k^2 + (6s^2 + 6s - 3)k - 2s^3 - 3s^2 + 3s + 2 \le 0$,

(39) $(e + k)x^2 + [e - 3k^2 + (6s + 4)k - 3s^2 - 3s]x + 2k^3 - (6s + 6)k^2 + (6s^2 + 12s + 4)k - 2s^3 - 6s^2 - 4s \le 0$.
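The simplification can be checked with exact rational arithmetic. The sketch below assumes that (29), (33), and (36) each have the form $4 + 3\sum m_i/w - (\sum q_i + \sum r_i)/w \ge u+e+1$ with the sums taken from (23)-(27), (30)-(32), and (34)-(35); clearing the denominator $(n-t+2)(n-t+1) = (x+1)x$ should then reproduce the left members of (37), (38), and (39). The function names are hypothetical.

```python
from fractions import Fraction as Fr


def cleared(case, n, u, t, e):
    """x(x+1) * [(u+e+1) - left side] for the assumed forms of
    inequalities (29), (33), (36), selected by case = 29, 33, 36."""
    a2, a3, a4, a5 = u - t + 2, u - t + 3, u - t + 4, u - t + 5
    d1 = n - t + 2
    d2 = (n - t + 2) * (n - t + 1)
    if case == 29:
        left = 4 + Fr(3 * a3 * a3, d1) - Fr(2 * a3 * a2 * a2, d2)
    elif case == 33:
        left = 4 + Fr(3 * a5 * a3, d1) - Fr(a5 * a3 * a2, d2) - Fr(a5 * a4 * a2, d2)
    else:
        left = 4 + Fr(3 * a4 * a3, d1) - Fr(2 * a4 * a3 * a2, d2)
    return ((u + e + 1) - left) * d2


def simplified(case, x, k, s, e):
    """Left members of (37), (38), (39), selected by case = 29, 33, 36."""
    if case == 29:
        b = e - 3 * k * k + (6 * s + 7) * k - 3 * s * s - 6 * s - 3
        c = (2 * k**3 - (6 * s + 10) * k * k + (6 * s * s + 20 * s + 16) * k
             - 2 * s**3 - 10 * s * s - 16 * s - 8)
    elif case == 33:
        b = e - 3 * k * k + (6 * s + 1) * k - 3 * s * s + 3
        c = (2 * k**3 - (6 * s + 3) * k * k + (6 * s * s + 6 * s - 3) * k
             - 2 * s**3 - 3 * s * s + 3 * s + 2)
    else:
        b = e - 3 * k * k + (6 * s + 4) * k - 3 * s * s - 3 * s
        c = (2 * k**3 - (6 * s + 6) * k * k + (6 * s * s + 12 * s + 4) * k
             - 2 * s**3 - 6 * s * s - 4 * s)
    return (e + k) * x * x + b * x + c


def check_simplification():
    """Confirm that clearing denominators gives (37)-(39) on a grid."""
    for case in (29, 33, 36):
        for t in range(7, 11):
            for u in range(t, t + 8):
                for n in range(t + 1, t + 12):
                    for e in range(4):
                        x, k, s = n - t + 1, u - 3, t - 7
                        assert cleared(case, n, u, t, e) == simplified(case, x, k, s, e)
    return True
```

Because $x(x+1) > 0$, each original inequality holds exactly when the corresponding polynomial is $\le 0$.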




In (37) we put $x = 2k-4e-6s-3$, and write the result:

(40) $(k - 7e - 8s - 1)^2 + 16e^3 + (48s - 29)e^2 + (48s^2 - 58s + 4)e + 16s^3 - 29s^2 + 4s$.

This is clearly positive for $s \ge 2$. If s = 1, it is positive if $e \ge 1$, and when e = 0
it reduces to $(u-12)^2-9$, which is positive for t = 8 and $u \ge 30$. A similar
detailed examination shows it to be positive for all $s \ge 0$ and $e \ge 0$. When
$x = 2k-4e-6s-4$, (37) becomes

(41) $-10ke - 10sk - 2k + 16e^3 + (48s + 28)e^2 + (48s^2 + 66s + 24)e + 16s^3 + 38s^2 + 26s + 4$,

which is negative for some sets of values of u, t, and e. Therefore
$x < 2k-4e-6s-3$, or

(42) $n < 2u - 4e - 5t + 32 \quad (t > 6)$.


We next put $x = 2k-4e-6s+2$ in (38) and find

(43) $\left(k - 5e - 6s + \frac{9}{2}\right)^2 + 16e^3 + (48s - 45)e^2 + (48s^2 - 90s + 39)e + 16s^3 - 45s^2 + 39s - \frac{49}{4}$.

This is positive for $s \ge 2$, in fact, is positive for all $s \ge 0$, $e \ge 0$. The value of
(38) when $x = 2k-4e-6s+1$ is

(44) $-6ke - 6sk + 5k + 16e^3 + (48s - 12)e^2 + (48s^2 - 18s - 10)e + 16s^3 - 6s^2 - 15s + 5$.

Therefore

(45) $n < 2u - 4e - 5t + 37 \quad (t > 6)$.


Finally, we seek the limit given by (39). Put $x = 2k-4e-6s-1$. Then the
left member of (39) becomes

(46) $(k - 5e - 6s)^2 + k + 16e^3 + (48s - 21)e^2 + (48s^2 - 42s)e + 16s^3 - 21s^2 - s$,

and this too is positive for $s \ge 0$; $e \ge 0$. The result of substituting
$2k-4e-6s-2$ for x in (39) is

(47) $-6ke - 6sk + 16e^3 + (48s + 12)e^2 + (48s^2 + 30s + 2)e + 16s^3 + 18s^2 + 2s$.

The limit on n, deduced from (39), is

(48) $n < 2u - 4e - 5t + 34 \quad (t > 6)$.




Of the three results (42), (45), and (48), (45) is to be used when t > 7. If
t = 7, only (37) and (39) are applicable, the latter when S is of order 3, and
therefore the limit for t = 7 is $n < 2u-4e-1$.

4. From this point on only groups more than 7-ply transitive will be in
question. If Theorems A, B, and II are applied to 8-ply transitive groups,
one quickly finds that $n > 158$, and $u > 94$. We are now to prove


THEOREM IV. The degree of an 8-ply transitive group of class u(>3) is less 
than 3u/2. 


Dr. Luther* has proved a remarkable theorem which may be stated as
follows:

C. Let G be a more than $2^a + p_1 + p_2 + \cdots + p_r$ times transitive group
($a \ge 2$; $p_1, p_2, \dots, p_r$ distinct odd primes) of class u (>3); if G contains a
substitution of even order of degree $u+e$,

(49) $n < u + \frac{2u}{2^a p_1 p_2 \cdots p_r - 2} + \frac{2^a e}{2^a - 2} + 1$.

If G is 8-ply transitive and includes a substitution of order 2 and degree
$\le u+e$, this theorem asserts that

(50) $n < \frac{6u}{5} + 2e + 1$,

without any restriction upon e. Theorem III states that if G contains no
substitution of even order of degree $\le u+e$,

(51) $n < 2u - 4e - 3$.

Equate the right hand members of these two inequalities and solve for e:

(52) $e = \frac{2u}{15} - \frac{2}{3}$.

We have a true upper limit for the degree of all 8-ply transitive groups that
are not alternating or symmetric if we put $2u/15+1/3$ for e in (50). Thus

(53) $n < \frac{6u}{5} + 2\left(\frac{2u}{15} + \frac{1}{3}\right) + 1 = \frac{3u}{2} - \frac{u - 50}{30} < \frac{3u}{2}$.

* American Journal of Mathematics, vol. 55 (1933).




The last step follows because u>94. 
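The arithmetic of this proof can be retraced exactly. The sketch below assumes that (50) reads $n < 6u/5 + 2e + 1$, which is the form consistent with the crossover value $2u/15 - 2/3$ and with the final expression $3u/2 - (u-50)/30$; the function names are hypothetical.

```python
from fractions import Fraction as Fr


def even_order_limit(u, e):
    """Right member of (50), in the assumed form 6u/5 + 2e + 1."""
    return Fr(6, 5) * u + 2 * e + 1


def odd_order_limit(u, e):
    """Right member of (51): Theorem III with t = 8."""
    return 2 * u - 4 * e - 3


def crossover(u):
    """Value of e at which (50) and (51) agree, as in (52)."""
    return Fr(2, 15) * u - Fr(2, 3)


def final_limit(u):
    """Limit obtained by putting e + 1 = 2u/15 + 1/3 into (50)."""
    return even_order_limit(u, crossover(u) + 1)
```

For every class u > 50 the value `final_limit(u)` equals $3u/2 - (u-50)/30$ and so lies below $3u/2$; the condition u > 94 used in the text is more than enough.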
5. The next theorem can be disposed of very briefly. 


THEOREM V. If a group of class u(>3) is more than 11-ply transitive, its 
degree is less than 4u/3. 


Let a = 3, r = 1, and $p_1 = 3$ in (49). Then, if there are substitutions of order
2 and of degree $\le u+e$ in G,

(54) $n < \frac{12u}{11} + \frac{4e}{3} + 1$.

Also, for t = 12, III becomes, all the substitutions of degree $\le u+e$ being of
odd order,

(55) $n < 2u - 4e - 23$.

If

(56) $\frac{12u}{11} + \frac{4e}{3} + 1 = 2u - 4e - 23$,

$e = 15u/88 - 9/2$. Then, as before, when we put $15u/88 - 7/2$ for e in (54),

(57) $n < \frac{29u}{22} - \frac{11}{3} < \frac{4u}{3}$.

6. It is possible to go one step farther in the elaboration of these ex- 
tremely concise limit formulas. 


THEOREM VI. The degree of a t-ply (t>21) transitive group of class u(>3) 
is less than 5u/4—t. 


From C, granting its hypothesis, and putting a =4, p, =5, it follows that 


(58) 


By III, if its hypothesis is granted, 
(45) n < 2u — 4e — 5¢t+ 37. 


Equating, solving for e, and proceeding as before, 


[July 
n<—+— 
= 11 3 
29u 
n<——— 
22 3 
4u 
<—- 
3 
| 


(59) $n < \dfrac{436u}{351} - \dfrac{10t}{9} + \dfrac{71}{7} = \dfrac{5u}{4} - \dfrac{10t}{9} - \dfrac{11u}{1404} + \dfrac{71}{7} < \dfrac{5u}{4} - \dfrac{10t}{9} - \dfrac{11}{1404}(u - 1295).$

From (59) it is clear that n < 5u/4 (t > 21), so that if u < 1295, n < 1619.
Since A, B, and a short list of prime numbers tell us that there is no
non-alternating 22-ply transitive group of degree <1619,

(60) $n < \dfrac{5u}{4} - \dfrac{10t}{9},$

(61) $n < \dfrac{5u}{4} - t.$

7. In what follows it is of advantage to know that the class of a 24-ply
transitive group exceeds 7600. This is a consequence of VI if n exceeds
9500, which fact can easily be verified by means of A and B and a list of
primes.

THEOREM VII. Let n be the degree of a t-ply (t>23) transitive group of class
u (>3); then

(62) $n < \dfrac{6u}{5} + \dfrac{u}{t} - t.$

Let $s = p_1 + p_2 + \cdots + p_r$ be the sum of r distinct odd primes, given in
advance, and let p be their product. G is a t-ply transitive group, and t is
large. Now let a be the largest integer such that

$2^a < t - s.$

The solution for e of

(63) $\dfrac{2^a p\,u}{2^a p - 2} + \dfrac{2^a e}{2^a - 2} + 1 = 2u - 4e - 5t + 37,$

where (49) has been set equal to (45), is

(64) $e = \dfrac{(2^a - 2)(2^a p - 4)u}{(5\cdot 2^a - 8)(2^a p - 2)} - \dfrac{(2^a - 2)(5t - 36)}{5\cdot 2^a - 8}.$

Then, on the insertion of e+1 in (49),


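(65) $n < \dfrac{2^a(6\cdot 2^a p - 8p - 4)u}{(5\cdot 2^a - 8)(2^a p - 2)} - \dfrac{2^a(5t - 36)}{5\cdot 2^a - 8} + \dfrac{2^a}{2^a - 2} + 1$

(66) $< \dfrac{6u}{5} + \dfrac{hu}{t} - t + j,$

where

(67) $h = \dfrac{(8p + 40)t}{5(5\cdot 2^a p - 8p - 10)}$

and

(68) $j = \dfrac{46\cdot 2^a - 8t - 16}{5\cdot 2^a - 8} + \dfrac{2}{2^a - 2}.$

Now let p = 35, s = 12. In this case, because t exceeds $12 + 2^a$, j < 38/5;
and because t does not exceed $12 + 2^{a+1}$,

(69) $h = \dfrac{64t}{5(35\cdot 2^a - 58)}$

(70) $\le \dfrac{128(2^a + 6)}{5(35\cdot 2^a - 58)}.$

If $a \ge 7$, we conclude from (70) that h < 4/5. Then for groups that are
more than 140 times transitive,

(71) $n < \dfrac{6u}{5} + \dfrac{u}{t} - t - \dfrac{1}{5}\left(\dfrac{u}{t} - 38\right).$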


Dr. Luther has proved in a simple way that for non-alternating t-ply
transitive groups

(72) $n \ge \dfrac{t^2 - 2t}{4}.$

Now for t > 21, we know that n < 5u/4 - t, or u > 4(n+t)/5, by VI. Therefore

(73) $u > \dfrac{t^2 + 2t}{5}.$

This inequality (73) holds for all primitive non-alternating t-ply transitive
groups, as can be easily seen by examining it for t = 1, 2, ..., 21.

In (71), u/t > 38 if (t+2)/5 ≥ 38, that is, if t ≥ 188. We have therefore
proved (62) in case t > 187.

Let (66) be written thus:

(74) $n < \dfrac{6u}{5} + \dfrac{u}{t} - t - \dfrac{1 - h}{t}(u - l),$

where l = jt/(1-h). It is clear that (62) is true when l < 7600. Now let

140 < t ≤ 187, p = 35;   h < 0.6,  j < 7.6, l < 3560;
 95 < t ≤ 140, p = 1001; h < 0.8,  j < 7.0, l < 4900;
 76 < t ≤ 95,  p = 35;   h < 0.6,  j < 7.6, l < 1820;
 44 < t ≤ 76,  p = 35;   h < 0.92, j < 7.6, l < 7250;
 37 < t ≤ 44,  p = 5;    h < 0.94, j < 7.7, l < 5650;
 30 < t ≤ 37,  p = 33;   h < 0.96, j < 6.7, l < 6200;
 24 < t ≤ 30,  p = 15;   h < 0.90, j < 7.4, l < 2230;
      t = 24,  p = 7;    h < 0.94, j < 7.5, l < 3000.
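The eight rows above can be checked mechanically. The following is a minimal sketch, not part of the original paper. It assumes h and j have the forms h = (8p+40)t / (5(5·2^a·p - 8p - 10)) and j = (46·2^a - 8t - 16)/(5·2^a - 8) + 2/(2^a - 2), with a the largest integer satisfying 2^a < t - s, and l = jt/(1-h). The names `ROWS`, `largest_a`, `h_j_l` and `row_holds` are hypothetical, and s is taken as the sum of the prime factors of each listed p.

```python
from fractions import Fraction

# (t_low, t_high, p, s, h_bound, j_bound, l_bound), meaning t_low < t <= t_high;
# p is the product and s the sum of the chosen distinct odd primes (assumed).
ROWS = [
    (140, 187, 35, 12, "0.6", "7.6", 3560),
    (95, 140, 1001, 31, "0.8", "7.0", 4900),
    (76, 95, 35, 12, "0.6", "7.6", 1820),
    (44, 76, 35, 12, "0.92", "7.6", 7250),
    (37, 44, 5, 5, "0.94", "7.7", 5650),
    (30, 37, 33, 14, "0.96", "6.7", 6200),
    (24, 30, 15, 8, "0.90", "7.4", 2230),
    (23, 24, 7, 7, "0.94", "7.5", 3000),
]


def largest_a(t, s):
    """Largest integer a with 2**a < t - s."""
    a = 0
    while 2 ** (a + 1) < t - s:
        a += 1
    return a


def h_j_l(t, p, s):
    """Exact values of h, j and l = j*t/(1-h) for one value of t."""
    x = 2 ** largest_a(t, s)
    h = Fraction((8 * p + 40) * t, 5 * (5 * x * p - 8 * p - 10))
    j = Fraction(46 * x - 8 * t - 16, 5 * x - 8) + Fraction(2, x - 2)
    return h, j, j * t / (1 - h)


def row_holds(row):
    """True when the stated bounds on h, j, l hold for every t in the row."""
    lo, hi, p, s, hb, jb, lb = row
    return all(
        h < Fraction(hb) and j < Fraction(jb) and l < lb
        for h, j, l in (h_j_l(t, p, s) for t in range(lo + 1, hi + 1))
    )


print([row_holds(r) for r in ROWS])
```

Exact rational arithmetic is used because several rows are tight (for instance j at t = 31 is just below 6.7), so floating-point rounding is avoided.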


It is proved that n < 6u/5 + u/t - t if t > 23 and u > 3.


STANFORD UNIVERSITY,
PALO ALTO, CALIF.


THE ARITHMETICAL THEORY OF LINEAR 
RECURRING SERIES* 


BY 
MORGAN WARD 


I. INTRODUCTION. THE DIFFERENCE EQUATION OF ORDER ONE 
1. Let m be an integer greater than one, and let

(u): $u_0, u_1, u_2, \ldots, u_n, \ldots$

be an arithmetical series† of order k; that is, a particular solution of the linear
difference equation

(1.1) $u_{n+k} = c_1 u_{n+k-1} + c_2 u_{n+k-2} + \cdots + c_k u_n,$

where $c_1, c_2, \ldots, c_k$ and the initial values $u_0, u_1, \ldots, u_{k-1}$ of (u) are given
integers. Then if $a_n$ is the least positive residue of $u_n$ modulo m, we may
associate with (u) a second sequence

(a): $a_0, a_1, a_2, \ldots$

which we call the reduced sequence corresponding to (u) modulo m.
It is easily seen that after a finite number of terms, the sequence (a) 


repeats itself periodically, and that any one of its periods is a multiple of a
certain least period which is called the characteristic number of (u) (or (a))
modulo m‡. The number of non-repeating terms in (a) is called the numeric
of (u) modulo m; if it is zero, (u) is said to be purely periodic§ modulo m. If
all the terms of (u) after a certain point are divisible by m, so that the re- 
peating part of (a) consists of the single residue zero, (u) is said to be a null 
sequence modulo m. 
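These definitions can be made concrete by brute force. The sketch below is illustrative only and is not the method of the paper: it follows the reduced sequence until a block of k consecutive residues repeats, and reads off the numeric and characteristic number. The function name `numeric_and_period` is hypothetical.

```python
def numeric_and_period(coeffs, init, m):
    """Return (numeric, characteristic number) of a recurring series modulo m.

    coeffs = [c_1, ..., c_k] and init = [u_0, ..., u_{k-1}] as in (1.1).
    """
    state = tuple(x % m for x in init)   # (a_n, ..., a_{n+k-1})
    seen = {}                            # state -> first index n at which it occurred
    n = 0
    while state not in seen:
        seen[state] = n
        # u_{n+k} = c_1 u_{n+k-1} + ... + c_k u_n
        nxt = sum(c * u for c, u in zip(coeffs, reversed(state))) % m
        state = state[1:] + (nxt,)
        n += 1
    first = seen[state]
    return first, n - first


print(numeric_and_period([1, 1], [0, 1], 10))  # Fibonacci series modulo 10
print(numeric_and_period([2], [1], 8))         # 1, 2, 4, 0, 0, ... modulo 8
```

The second call illustrates a null sequence: after three non-repeating terms the residues are all zero, so the repeating part has length one.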

Three important problems immediately suggest themselves: first, to
determine the characteristic number and numeric of the sequence (u) as


* Presented to the Society, August 31, 1932; received by the editors September 6, 1932. 

† The literature prior to 1917 is summarized in Dickson’s History, vol. I, chapter XVII. Among
the more recent papers, D. H. Lehmer, Annals of Mathematics, (2), vol. 31 (1930), pp. 419-449, 
treats the case k=2, and the author, these Transactions, vol. 33 (1931), pp. 153-165, the case k=3. 
For general k, see R. D. Carmichael, Quarterly Journal of Mathematics, vol. 48 (1920), pp. 343-372. 
Certain of Carmichael’s results were extended by the use of ideals by H. T. Engstrom, these Trans- 
actions, vol. 33 (1931), pp. 210-218. I shall refer to these papers by the authors’ name and page num- 
ber. For the bearing of the problem upon elementary number theory, see R. D. Carmichael, American 
Mathematical Monthly, vol. 36 (1929), pp. 132-143. 

‡ This term is due to Carmichael, p. 345.

§ This is always the case if m is prime to $c_k$ in (1.1).




functions of the 2k+1 integers $c_1, \ldots, c_k$, $u_0, \ldots, u_{k-1}$, and m*; secondly,
given (1.1) and m, to determine least upper bounds for the characteristic 
number and numeric of any solution of (1.1); and thirdly, given m and k, to 
determine the least upper bounds for the characteristic number and numeric 
of any arithmetical series of order k. The bearing of these problems upon the 
arithmetical properties of such series is evident; nevertheless none of them 
has as yet been completely solved.†

2. The course of the investigation may best be explained by considering 
the special case of a difference equation of order one, 


(2.1) $u_{n+1} = c\,u_n.$

Any solution (u) of (2.1) is of the form

$u_n = u_0 c^n,$

where $u_0$ is an integer. It is possible to express this solution as the sum of two
other solutions $v_n = v_0 c^n$ and $w_n = w_0 c^n$ where for the modulus m, (v) is a null
sequence with the same numeric as (u), and (w) is a purely periodic sequence
with the same characteristic number. The numbers $v_0$ and $w_0$ may be
determined as soon as $u_0$ is known.

It readily follows that the numeric and characteristic number of the
sequence (u) modulo m are respectively the least values of n such that

(2.2) $v_0 c^n \equiv 0 \pmod m, \qquad w_0(c^n - 1) \equiv 0 \pmod m.$


In the special case when m is a prime p and $w_0$ is not divisible by p, the
least value of n for which the second of these congruences is satisfied is simply
the exponent to which c belongs modulo p. A complete solution of our
fundamental problems is thus at present out of the question even for a difference
equation of order one. Nevertheless it is of considerable interest to reduce
the general problem to its basic constituents. A short analysis discloses that
in order to determine the minimal values of n in (2.2) it is sufficient to know

(i) the decomposition of m, $v_0$, $w_0$ and c into their prime factors;
(ii) the least value of n such that

$c^n \equiv 1 \pmod p$

for every prime factor p of m;
(iii) if λ is the least value of n satisfying (ii), the highest power of p dividing
$c^\lambda - 1$.

* Compare Carmichael, pp. 345, 346. 
† Compare Engstrom, p. 218.




Furthermore, (i) alone suffices for the determination of the numeric of
(u), and (i) and (ii) alone for the determination of the characteristic number
of (u) for all square-free integers m. (ii) is the unsolved problem of determining
the exponent to which a given integer belongs for a given prime modulus,
while (iii) is equivalent to the (unsolved) problem of the quotients of Fermat:
to find the highest power of p dividing $c^{p-1} - 1$.
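To illustrate how the data (ii) and (iii) combine, here is a minimal sketch. It relies on a standard fact that is not stated in the text and is assumed here: for an odd prime p not dividing c, if λ is the exponent of c modulo p and p^e is the highest power of p dividing c^λ - 1, then the exponent of c modulo p^N is λ·p^max(0, N-e). All function names are hypothetical.

```python
def order_mod(c, p):
    """Exponent to which c belongs modulo the prime p (p must not divide c)."""
    n, x = 1, c % p
    while x != 1:
        x = x * c % p
        n += 1
    return n


def valuation(x, p):
    """Exponent of the highest power of p dividing the nonzero integer x."""
    if x == 0:
        raise ValueError("valuation of zero is undefined")
    e = 0
    while x % p == 0:
        x //= p
        e += 1
    return e


def order_mod_prime_power(c, p, N):
    """Exponent of c modulo p**N, assuming p is an odd prime, p does not divide c, c > 1."""
    lam = order_mod(c, p)            # datum (ii)
    e = valuation(c ** lam - 1, p)   # datum (iii)
    return lam * p ** max(0, N - e)


print(order_mod_prime_power(2, 3, 3))   # exponent of 2 modulo 27
print(order_mod_prime_power(10, 3, 3))  # exponent of 10 modulo 27
```

With $w_0$ prime to p this exponent is the characteristic number of $w_0 c^n$ modulo p^N; the case p = 2 needs separate treatment and is not covered by the sketch.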

Let us pass now to the general case of a difference equation of order k.
Let

$F(x) = x^k - c_1 x^{k-1} - \cdots - c_k$

denote the polynomial associated with the difference equation (1.1), and (u)
as before any solution of (1.1). Then we can associate with (1.1) and m two
congruences analogous to (2.2):

$V(x)x^n \equiv 0$ (modd m, F(x)), $\quad W(x)(x^n - 1) \equiv 0$ (modd m, F(x)),

where V(x) and W(x) are two polynomials whose coefficients may be
determined as soon as the k initial values of (u) are known. The numeric and
characteristic number of (u) modulo m are respectively the least values of n
such that the first and second of these congruences are satisfied.

The central result of this investigation is that these minimal values of n 
may be determined in general provided that we know the following: 

[i] (a) the decomposition of m into its prime factors; 

(b) the Schönemann decompositions* of F(x), V(x) and W(x) modulo
$p^N$, where p is a prime factor of m;

[ii] for every prime factor p of m and every irreducible polynomial factor
φ(x) of F(x) to the modulus p, the least value of n such that

$x^n \equiv 1$ (modd p, φ(x));

[iii] if λ is the least value of n satisfying [ii], the polynomial L(x) defined by

— 1 = (modd 9’, ¢?(x)). 

We have then a complete analogy with the case of a difference equation
of order one. Corresponding to (ii), [ii] is the unsolved problem of
determining the period of a mark in a Galois field, while [iii] is a kind of
generalization of the problem of the quotients of Fermat.†

The methods employed are elementary in the sense that no use is made
either of the theory of ideals or the “fundamental theorem of algebra.”
Instead free use is made of polynomial congruences to single and double moduli
in the spirit of Kronecker’s theory of algebraic fields. The difficulties in the
algebraic treatment due to discriminantal divisors are thereby evaded.‡

* See Fricke’s Algebra, vol. 2, Braunschweig, 1928, chapter 2, and §7 of the present paper.

† Compare Ward, p. 161.
‡ Compare Engstrom, p. 211.




3. We shall adopt the following terminology in this paper. The term
polynomial is restricted to mean a polynomial with integral coefficients; if the
leading coefficient of the polynomial is unity, it will be said to be primary.
We designate polynomials by A(x), B(x), ..., U(x), V(x), ..., θ(x),
φ(x), .... A polynomial is said to be divisible by an integer m when and
only when all of its coefficients are divisible by m. The notations Res{A(x),
B(x)} and (a, b, ...) will be used for the resultant of two polynomials A(x)
and B(x) and the greatest common divisor of two or more integers a, b, ....

If (a) is the reduced sequence corresponding to the solution (u) of (1.1)
modulo m, and if μ is a period of (a), we shall say that (u) admits the period
μ (modd m, F(x)) where it will be recalled that $F(x) = x^k - c_1x^{k-1} - \cdots - c_k$ is the
polynomial associated with the difference equation (1.1). In like manner, we
shall refer to the characteristic number of (u) as its characteristic number
(modd m, F(x)) whenever it is necessary to bring m and F(x) in evidence. The
notation

(u) ≡ (v), (u) ≡ (a) (mod m), 0 ≤ a < m,

is self-explanatory.

The following convenient definition was introduced by H. T. Engstrom*:
A number π is said to be a general period of the difference equation (1.1) for
the modulus m if every sequence of rational integers (u) satisfying (1.1) has
the period π. Let τ be the least such general period for the modulus m. Then
it is easily seen that every other general period is a multiple of τ, and that the
characteristic number of any particular sequence (u) is a divisor of τ. We
shall call τ the principal period of the difference equation (1.1) (modd m,
F(x)). It possesses the following important property:


THEOREM 3.1. There exist solutions of (1.1) whose characteristic number 
modulo m is the principal period of (1.1). 


Let (u) and (w) be any two solutions of (1.1). Then if we can determine
integers $b_1, b_2, \ldots, b_k$ such that

$u_n \equiv b_1 w_n + b_2 w_{n+1} + \cdots + b_k w_{n+k-1} \pmod m \qquad (n = 0, 1, 2, \ldots),$

the characteristic number of (w) will be a period of (u). Owing to the linearity
of (1.1) these congruences will hold for every n provided that they hold for
n = 0, 1, 2, ..., k-1. But a sufficient condition that the k congruences

$u_n \equiv b_1 w_n + b_2 w_{n+1} + \cdots + b_k w_{n+k-1} \pmod m \qquad (n = 0, 1, \ldots, k-1)$

* Engstrom, p. 210.




have integral solutions $b_1, \ldots, b_k$ is that their determinant be prime to m.
For that particular sequence (w) with the initial values $w_0 = w_1 = \cdots = w_{k-2} = 0$,
$w_{k-1} = 1$, this determinant has the value $(-1)^{k(k-1)/2}$.

Hence the characteristic number of (w) is a general period of (1.1). But 
the characteristic number of (w) must divide the principal period. Hence it is 
equal to it. 

Thus the principal period is the least upper bound of the characteristic 
numbers of all solutions of (1.1), and the determination of the characteristic 
number of (w) gives the solution of the second fundamental problem men- 
tioned in the introduction. 
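Theorem 3.1 can be checked numerically for small cases. The sketch below is a brute-force illustration, not the argument of the paper: it takes the principal period to be the least common multiple of the characteristic numbers of all m^k reduced sequences and compares it with the characteristic number of the sequence with initial values 0, ..., 0, 1. The function names are hypothetical, and `math.lcm` with several arguments requires Python 3.9 or later.

```python
from itertools import product
from math import lcm


def characteristic_number(coeffs, init, m):
    """Least period of the repeating part of the reduced sequence modulo m."""
    state = tuple(x % m for x in init)
    seen = {}
    n = 0
    while state not in seen:
        seen[state] = n
        nxt = sum(c * u for c, u in zip(coeffs, reversed(state))) % m
        state = state[1:] + (nxt,)
        n += 1
    return n - seen[state]


def principal_period(coeffs, m):
    """Least general period: lcm over every choice of k initial residues."""
    k = len(coeffs)
    return lcm(*(characteristic_number(coeffs, init, m)
                 for init in product(range(m), repeat=k)))


def impulse_period(coeffs, m):
    """Characteristic number of the solution (w) with initial values 0,...,0,1."""
    k = len(coeffs)
    return characteristic_number(coeffs, [0] * (k - 1) + [1], m)


for coeffs, m in [([1, 1], 10), ([1, 1, 1], 4), ([1, 2], 6)]:
    print(coeffs, m, principal_period(coeffs, m), impulse_period(coeffs, m))
```

The enumeration costs m^k sequence walks, so it is only practical for small m and k; its purpose is to make the statement tangible, not to compute principal periods efficiently.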


COROLLARY. If (u) is any solution of (1.1) and if Δ(u) denotes the
determinant

$\Delta(u) = \begin{vmatrix} u_0 & u_1 & \cdots & u_{k-1} \\ u_1 & u_2 & \cdots & u_k \\ \vdots & \vdots & & \vdots \\ u_{k-1} & u_k & \cdots & u_{2k-2} \end{vmatrix},$

then if Δ(u) is prime to m, the characteristic number of (u) is the principal
period of (1.1).


As an application of this corollary, consider the solution (s) of (1.1) with
the initial values $s_0 = k$, $s_1 = c_1$, $s_2 = c_1^2 + 2c_2$ and so on, so that if the
discriminant of F(x) does not vanish, $s_n$ is the familiar sum of the nth powers of the
roots of F(x) = 0. It is well known that Δ(s) equals the discriminant of F(x).
Hence the characteristic number of (s) is the principal period of (1.1) provided
that m is prime to the discriminant of F(x).


II. THE RELATIONSHIP WITH THE RING ASSOCIATED 
WITH THE DOUBLE MODULUS 


4. We begin by considering the solutions of (1.1) from a group-theoretic
standpoint. If we regard any two solutions (u) and (v) of (1.1) as one-rowed
matrices we may define their “sum” to be the sequence (u+v):


(u) + (v) = (u+v).


The set of all solutions of (1.1) forms an infinite Abelian group with respect
to the operation of vector addition just defined, the identity element of the
group being the sequence


(0): 0, 0, 0, ....




Denote this group by 𝔘 and the corresponding finite group of the reduced
sequences (a) by 𝔄. The relationship between these two groups may be
conveniently symbolized by writing

𝔘 ≡ 𝔄 (mod m).

Now the method of attack upon the fundamental problems mentioned in
the introduction is to set up an isomorphism between the group 𝔄 and the ring
of residue classes associated with the double modulus m and F(x). The problems
considered are thus transformed into problems belonging to the theory of
congruences to a double modulus which admit of perfectly definite answers.

To set up this isomorphism, it is necessary to define the “product” of two
sequences (u) and (v). How this may be done will be explained in §6; for the
present, we will confine ourselves to developing the idea of addition of
sequences.

THEOREM 4.1. Every sequence (u) may be uniquely represented modulo m
as the sum of a null sequence and a purely periodic sequence with the same
numeric and characteristic number.

Let λ and μ be respectively the numeric and characteristic number of (u)
modulo m, and suppose that λ ≡ -r (mod μ), where 0 ≤ r < μ, so that λ + r
= qμ.

Set $v_n = u_{n+q\mu}$, $w_n = u_n - v_n$ (n = 0, 1, ...).

Then (v) is a purely periodic sequence with the characteristic number μ
modulo m, and

(u) = (v) + (w).


(w) is a null sequence modulo m with the numeric λ. For if n ≥ 0,

$w_{n+\lambda} = u_{n+\lambda} - v_{n+\lambda} = u_{n+\lambda} - u_{n+\lambda+q\mu} \equiv 0,$

$w_{\lambda-1} = u_{\lambda-1} - v_{\lambda-1} = u_{\lambda-1} - u_{\lambda-1+q\mu} \not\equiv 0 \pmod m.$


Such a representation of (u) is unique modulo m; for if there were a second
one

(u) = (v') + (w')

we would have (w-w') = (v'-v), so that (w-w') would be a purely periodic
null sequence. Hence (w-w') ≡ (0) (mod m), (w) ≡ (w'), (v) ≡ (v') (mod m).

It is evident that the set of all null sequences of 𝔄 and the set of all purely
periodic sequences of 𝔄 are both sub-groups of 𝔄. If we denote these
sub-groups by 𝔑 and 𝔓, we have from Theorem 4.1

THEOREM 4.2. The group 𝔄 is the direct sum of 𝔑 and 𝔓, where 𝔑 is the
group of all null sequences of 𝔄, and 𝔓 is the group of all purely periodic
sequences of 𝔄.




5. If we form from the first n terms of any solution (u) of (1.1) a
polynomial of degree n-1 in the indeterminate x

$U_n(x) = u_0 x^{n-1} + u_1 x^{n-2} + \cdots + u_{n-1},$

it is easily verified that we have identically in x

$F(x)U_n(x) = x^n\{u_0x^{k-1} + (u_1 - c_1u_0)x^{k-2} + \cdots + (u_{k-1} - c_1u_{k-2} - \cdots - c_{k-1}u_0)\}$
$\qquad - \{u_nx^{k-1} + (u_{n+1} - c_1u_n)x^{k-2} + \cdots + (u_{n+k-1} - c_1u_{n+k-2} - \cdots - c_{k-1}u_n)\}.$

Denote the two polynomials in brackets by U(x) and $U^{(n)}(x)$ respectively.
Then on considering the identity modulo m, we obtain the congruence

(5.1) $x^nU(x) - U^{(n)}(x) \equiv 0$ (modd m, F(x)).

Assume first that (u) is purely periodic modulo m and admits the period
n. Then $U^{(n)}(x) \equiv U(x)$ (mod m), so that (5.1) becomes

$(x^n - 1)U(x) \equiv 0$ (modd m, F(x)).

Conversely if for some n this latter congruence holds, (u) is purely
periodic modulo m and admits the period n.

Secondly, assume that (u) is a null sequence modulo m of numeric ≤ n.
Then $U^{(n)}(x) \equiv 0$ (mod m) and (5.1) becomes

$x^nU(x) \equiv 0$ (modd m, F(x)).

Conversely if for some n this latter congruence holds, (u) is a null
sequence of numeric ≤ n. We have thus established the following two basic
theorems:


FUNDAMENTAL THEOREM ON PURELY PERIODIC SEQUENCES. If (u) is any
solution of the difference equation (1.1), then a necessary and sufficient condition
that (u) should be purely periodic and admit the period n (modd m, F(x)) is that

(5.2) $(x^n - 1)U(x) \equiv 0$ (modd m, F(x)),

where

(5.3) $U(x) = u_0x^{k-1} + (u_1 - c_1u_0)x^{k-2} + \cdots + (u_{k-1} - c_1u_{k-2} - \cdots - c_{k-1}u_0)$

is a polynomial of degree k-1 in x whose coefficients are determined entirely by
the k initial values of (u) and the coefficients of (1.1), while F(x) is the
polynomial associated with (1.1).


We shall call the polynomial U(x) which completely determines the k
initial values of (u) and hence (u) itself, the generator of (u).




FUNDAMENTAL THEOREM ON NULL SEQUENCES. If U(x) is the generator of
the sequence (u), then a necessary and sufficient condition that (u) should be a
null sequence with numeric less than or equal to n is that

(5.4) $x^nU(x) \equiv 0$ (modd m, F(x)).

We have the following important corollaries to these theorems.


COROLLARY 1. If (u) is a purely periodic sequence modulo m, its
characteristic number is the least value of n for which the congruence (5.2) is satisfied.


COROLLARY 2. If (u) is a null sequence modulo m, its numeric is the least
value of n for which the congruence (5.4) is satisfied.


The generator of the sequence (w) with the initial values 0, 0, ..., 0, 1
is unity. Hence we have from Theorem 3.1


COROLLARY 3. The principal period of (1.1) modulo m is the least value of n
such that

$x^n \equiv 1$ (modd m, F(x)).
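Corollary 3 lends itself to a direct computation with residues to the double modulus. The following is a minimal sketch under the assumption that a residue is stored as its k coefficients in ascending powers of x; the helper names `times_x`, `order_of_x` and `impulse_period` are hypothetical. It searches for the least n with x^n ≡ 1 (modd m, F(x)) and can be compared with the period of the sequence 0, ..., 0, 1 found by stepping the recurrence.

```python
def times_x(r, coeffs, m):
    """Multiply r_0 + r_1 x + ... + r_{k-1} x^{k-1} by x modulo (m, F(x))."""
    k = len(coeffs)
    top = r[-1]  # coefficient that overflows into x^k
    # x^k is replaced by c_1 x^{k-1} + ... + c_k, so x^j receives top * c_{k-j}.
    return [((r[j - 1] if j else 0) + top * coeffs[k - 1 - j]) % m
            for j in range(k)]


def order_of_x(coeffs, m):
    """Least n >= 1 with x^n ≡ 1 (modd m, F(x)), or None if no such n exists."""
    k = len(coeffs)
    one = [1 % m] + [0] * (k - 1)
    r = one
    for n in range(1, m ** k + 1):   # at most m^k distinct residues
        r = times_x(r, coeffs, m)
        if r == one:
            return n
    return None


def impulse_period(coeffs, m):
    """Period of the reduced sequence with initial values 0,...,0,1 (brute force)."""
    k = len(coeffs)
    state = tuple([0] * (k - 1) + [1 % m])
    seen, n = {}, 0
    while state not in seen:
        seen[state] = n
        nxt = sum(c * u for c, u in zip(coeffs, reversed(state))) % m
        state = state[1:] + (nxt,)
        n += 1
    return n - seen[state]


print(order_of_x([1, 1], 10), impulse_period([1, 1], 10))
```

When $c_k$ is not prime to m no power of x is congruent to 1, which is why the search is bounded and may return `None`.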


6. We are now ready to establish the isomorphism between the ring of
residue classes associated with the double modulus m, F(x) and the group 𝔄
of reduced sequences defined in §4. The ring may be represented by the set of
$m^k$ polynomials

$L(x) = l_0x^{k-1} + l_1x^{k-2} + \cdots + l_{k-1} \qquad (0 \le l_r < m).$

On identifying U(x) of (5.3) modulo m with L(x) we obtain the
congruences

(6.1) $u_r - c_1u_{r-1} - \cdots - c_ru_0 \equiv l_r \pmod m$, r = 0, ..., k-1.

These congruences have a unique solution

$u_i \equiv a_i \pmod m$, $0 \le a_i < m$ (i = 0, ..., k-1).

We associate with L(x) the reduced sequence (a) whose initial values are
$a_0, a_1, \ldots, a_{k-1}$, and write

(a) ~ L(x).

Since the congruences (6.1) are solvable for the $l_r$ for any m, given (a),
we can determine a unique L(x). The correspondence is therefore a reciprocal
one.

Suppose that

(b) ~ M(x).

Then evidently

(a + b) ~ L(x) + M(x).



If L(x)·M(x) ≡ N(x) (modd m, F(x)), we define the reduced sequence (c)
associated with N(x) to be the product of the sequences (a) and (b). The
exact dependence of the elements of (c) upon those of (a) and (b) need not
detain us here. If we write (a)·(b) for the product of the sequences (a) and
(b), we have then

(a)·(b) ~ L(x)·M(x).

It is easily verified that the set 𝔄 with the two operations of addition and
multiplication just defined satisfies the postulates for a ring*; hence we have
the following result:

THEOREM 6.1. The set 𝔄 of reduced sequences modulo m forms a commutative
ring with respect to the operations of addition and multiplication of sequences
defined above which is simply isomorphic with the ring ℜ of residue classes
associated with the double modulus m, F(x).

If

(a): $a_0, a_1, a_2, \ldots$

is any sequence of 𝔄, the corresponding element of the ring ℜ is

$L(x) = l_0x^{k-1} + l_1x^{k-2} + \cdots + l_{k-1}$

where

$l_r \equiv a_r - c_1a_{r-1} - \cdots - c_ra_0 \pmod m$, r = 0, ..., k-1.


To examine the nature of this correspondence further, we need the follow- 
ing lemma. 


Lemma. If (u) is a solution of the difference equation (1.1), and if A(u) 


denotes the determinant 
Up—1 


Als) @ 


uk, 
and U(x) the polynomial 
U(x) = + (uy — Cyto) + (ug — Cyt, — Cotto) x**+--- 


+ — — * — 


then (—1)*A(u) is equal to the resultant of U(x) and F(x), where F(x) is the 
polynomial associated with the difference equation (1.1). 


* van der Waerden, Algebra, Berlin, 1930, vol. 1, p. 37. 




The nature of the proof is sufficiently indicated by the special case k =3. 
The resultant of U(x) and F(x) may then be expressed as the five-rowed 
eliminant 
Uy — Cio, U2 — — Coto, 0, 0 
Uo, Uy — Ue — CU, — Coto, O 

uo, — Ug — — Coto|. 


— C2, — Cs, 0 


Ci» — C2; — C3 


Now perform upon E the operations 


row 1 — uw row4 — row5, row2 — urow5S. 


The first two elements in the first three rows of E become zero, so that E re- 
duces to the third-order determinant 


U2, CoM; + 
E=-—| %, Ue — Cit, C30 
Uo, Uy — Ug — — Coto 
From the difference equation, 
Us = Cie + Coty + Cyto, Ug = Cig + Cote + 
Hence performing upon £ successively the operations 
col3 + ¢,col1 + col 2, 
we obtain 


U2, U3, U2, Uz, Us 
E=- U2, C3Uo — | &, U2, U3 (- 1)*A(u). 


Uo, U1, Ug — C4, — Coto Uo, U1, Ue 


THEOREM 6.2. To the units of the ring ℜ correspond those sequences of 𝔄
whose characteristic number is the principal period of the difference equation (1.1)
modulo m, while to the identity element 1 of ℜ there corresponds the sequence (w)
with the initial values 0, 0, ..., 0, 1.


For the units of ℜ are represented by those polynomials L(x) such that
the resultant of L(x) and F(x) is prime to m. But if L(x) = U(x) is the
generator of the sequence (u), we have just seen that Δ(u) is numerically
equal to the resultant of L(x) and F(x). By the corollary to Theorem 3.1, the
characteristic number of all sequences (u) with Δ(u) prime to m is the same,
and equal to the principal period of (1.1) modulo m. The latter part of the
theorem follows from the fact that for the sequence (w): 0, 0, ..., 0, 1, ...
we have W(x) = 1.


III. SIMPLIFICATION OF THE FORM OF THE MODULUS 
AND ASSOCIATED POLYNOMIAL 


7. If $m = p_1^{\alpha_1} p_2^{\alpha_2} \cdots p_r^{\alpha_r}$ is the decomposition of m into its prime factors,
then it is easy to see that the ring associated with the double modulus m,
F(x) is the direct sum of the r rings associated with the double moduli
$p_i^{\alpha_i}$, F(x). We have of course a similar dissection of the ring 𝔄 into a sum of
simpler rings. The following important theorem gives the corresponding
reduction of the problem of determining the characteristic number and
numeric of any sequence modulo m to the case when m is a power of a prime.


THEOREM 7.1. If

$m = p_1^{\alpha_1} p_2^{\alpha_2} \cdots p_r^{\alpha_r}$

is the decomposition of m into its prime factors, then the characteristic number of
any sequence modulo m is the least common multiple of its characteristic numbers
modulis $p_i^{\alpha_i}$ (i = 1, ..., r) while its numeric is the maximum of its numerics
modulis $p_i^{\alpha_i}$.

It is sufficient to show that if m = a·b where a and b are relatively prime,
then the characteristic number of (u) modulo m is the least common
multiple of its characteristic numbers modulo a and modulo b, while its numeric
modulo m is the greatest of its numerics modulo a and modulo b.


Let

(u) ≡ (v) + (w) (mod m)

be the unique decomposition of (u) into a null sequence (v) and a purely
periodic sequence (w). Then since a and b divide m,

(u) ≡ (v) + (w) (mod a), and (u) ≡ (v) + (w) (mod b).

Furthermore (v) is a null sequence modulis a and b and (w) is a purely periodic
sequence modulis a and b.

In view of Theorem 4.1, it is sufficient to prove the result for the numeric
of (v) and the characteristic number of (w).

Consider first (v), and let V(x) be its generator, $\nu_m$, $\nu_a$ and $\nu_b$ its numerics
modulis m, a and b respectively, and τ the greatest of $\nu_a$ and $\nu_b$. Then by the
fundamental theorem of §5,

$x^{\nu_m}V(x) \equiv 0$ (modd m, F(x)), $\quad x^{\nu_a}V(x) \equiv 0$ (modd a, F(x)),
$x^{\nu_b}V(x) \equiv 0$ (modd b, F(x)).

Thus $x^{\nu_m}V(x) \equiv 0$ (modd a, F(x)) and (modd b, F(x)) so that $\nu_m \ge \tau$. But since
a and b are relatively prime,

$x^{\tau}V(x) \equiv 0$ (modd ab, F(x))

so that $\tau \ge \nu_m$. Hence $\tau = \nu_m$.

The proof for the characteristic number of (w) is similar and will be left
to the reader.*
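The reduction to relatively prime moduli is easy to test by brute force. The sketch below is an illustration only, with hypothetical function names: it computes the numeric and characteristic number modulo a, b and ab by walking the reduced sequence, then compares them as the theorem predicts.

```python
from math import gcd, lcm


def numeric_and_period(coeffs, init, m):
    """Return (numeric, characteristic number) of the series modulo m."""
    state = tuple(x % m for x in init)
    seen, n = {}, 0
    while state not in seen:
        seen[state] = n
        nxt = sum(c * u for c, u in zip(coeffs, reversed(state))) % m
        state = state[1:] + (nxt,)
        n += 1
    return seen[state], n - seen[state]


def agrees_with_theorem_7_1(coeffs, init, a, b):
    """Check numeric = max and characteristic number = lcm for m = a*b, (a, b) = 1."""
    if gcd(a, b) != 1:
        raise ValueError("a and b must be relatively prime")
    num_m, per_m = numeric_and_period(coeffs, init, a * b)
    num_a, per_a = numeric_and_period(coeffs, init, a)
    num_b, per_b = numeric_and_period(coeffs, init, b)
    return (num_m, per_m) == (max(num_a, num_b), lcm(per_a, per_b))


print(agrees_with_theorem_7_1([1, 1], [0, 1], 2, 5))  # Fibonacci, m = 10
print(agrees_with_theorem_7_1([2], [1], 4, 3))        # u_n = 2^n, m = 12
```

Applying the check repeatedly to a factorization of m into prime powers gives the full statement of the theorem.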
We shall assume hereafter that $m = p^N$, p a prime, N a given integer.
Now suppose that

$F(x) \equiv \{\phi_1(x)\}^{t_1}\{\phi_2(x)\}^{t_2} \cdots \{\phi_s(x)\}^{t_s} \pmod p$

is the unique decomposition of F(x) modulo p into a product of powers of
primary irreducible polynomials φ(x). Then by Schönemann’s second
theorem† there exists a decomposition of F(x) modulo $p^N$ of the form

(7.1) $F(x) \equiv F_1(x)\cdot F_2(x) \cdots F_s(x) \pmod{p^N}$

where

$F_i(x) \equiv \{\phi_i(x)\}^{t_i} \pmod p$, i = 1, 2, ..., s,

and the polynomials $F_i(x)$ are primary. We shall refer to (7.1) as a
Schönemann decomposition of F(x) (modulo $p^N$).

Corresponding to this decomposition of F(x), we have a decomposition
of the ring associated with the double modulus $p^N$, F(x) into the direct sum of
the s rings associated with the moduli $p^N$, $F_i(x)$. If U(x) is any element of this
ring, and

$U(x) \equiv U^{(i)}(x)$ (modd $p^N$, $F_i(x)$), i = 1, ..., s,

where $U^{(i)}(x)$ is of degree less than $F_i(x)$, then U(x) may be uniquely
represented as

$U(x) \equiv B^{(1)}(x)U^{(1)}(x) + B^{(2)}(x)U^{(2)}(x) + \cdots + B^{(s)}(x)U^{(s)}(x)$ (modd $p^N$, F(x))

where the $B^{(i)}(x)$ are of degree less than F(x) and

$B^{(i)}(x) \equiv 1$ (modd $p^N$, $F_i(x)$),
$B^{(i)}(x) \equiv 0$ (modd $p^N$, $F_j(x)$), j ≠ i.


* See Ward, p. 155, Theorem 3.11.
† See Fricke, work cited, §11.




If (u) is the sequence generated by U(x), $(u^{(i)})$ and $(b^{(i)})$ the sequences
generated by $U^{(i)}(x)$ and $B^{(i)}(x)$, the analogous decomposition of (u) is

$(u) \equiv (b^{(1)})\cdot(u^{(1)}) + (b^{(2)})\cdot(u^{(2)}) + \cdots + (b^{(s)})\cdot(u^{(s)}) \pmod{p^N}.$

The corresponding theorem for the characteristic numbers and numeric of
(u) is as follows:


THEOREM 7.2. Suppose that (7.1) is a Schönemann decomposition of F(x)
modulo $p^N$, and that U(x) is a polynomial of degree ≤ k-1 in x generating a
sequence (u). Furthermore suppose that

$U(x) \equiv U^{(i)}(x)$ (modd $p^N$, $F_i(x)$)

where $U^{(i)}(x)$ is a polynomial of degree less than $F_i(x)$, and the generator of a
sequence $(u^{(i)})$ which is a solution of the difference equation whose associated
polynomial is $F_i(x)$.

Then the characteristic number of (u) (modd $p^N$, F(x)) is the least common
multiple of the characteristic numbers of $(u^{(i)})$ (modd $p^N$, $F_i(x)$) and the numeric
of (u) is the maximum of the numerics of the $(u^{(i)})$.


Suppose that

(u) ≡ (v) + (w) (mod $p^N$) and U(x) ≡ V(x) + W(x) (modd $p^N$, F(x))

are the decompositions of (u) into a null sequence (v) and a purely periodic
sequence (w), and the corresponding decomposition of the generator U(x) of
(u). Furthermore, suppose that

$U(x) \equiv U^{(i)}(x)$, $V(x) \equiv V^{(i)}(x)$, $W(x) \equiv W^{(i)}(x)$ (modd $p^N$, $F_i(x)$)

where the polynomials on the right side of the congruences are of lesser
degree than $F_i(x)$, and that $(u^{(i)})$, $(v^{(i)})$ and $(w^{(i)})$ are the solutions of the
difference equation associated with $F_i(x)$ with the generators $U^{(i)}(x)$, $V^{(i)}(x)$
and $W^{(i)}(x)$ respectively. Then we may write

(7.2) $(u^{(i)}) \equiv (v^{(i)}) + (w^{(i)}) \pmod{p^N}$, $\quad U^{(i)}(x) \equiv V^{(i)}(x) + W^{(i)}(x)$ (modd $p^N$, $F_i(x)$).

I assert that (7.2) gives the decomposition of $(u^{(i)})$ into its purely periodic
and null components; for if τ and λ are the numeric and characteristic number
of (u), we have by the theorems of §§4 and 5

$x^{\tau}V(x) \equiv 0$, $\quad x^{\lambda}W(x) \equiv W(x)$ (modd $p^N$, F(x)).

Hence

(7.3) $x^{\tau}V^{(i)}(x) \equiv 0$, $\quad x^{\lambda}W^{(i)}(x) \equiv W^{(i)}(x)$ (modd $p^N$, $F_i(x)$)

so that by the theorems of §5, $(v^{(i)})$ is a null sequence and $(w^{(i)})$ is a purely
periodic sequence. By Theorem 4.1, the numeric of $(v^{(i)})$ and the
characteristic number of $(w^{(i)})$ are the numeric and the characteristic number of $(u^{(i)})$.
Call this latter number $\lambda_i$; and let μ be the least common multiple of $\lambda_1$,
$\lambda_2, \ldots, \lambda_s$. From the second congruence in (7.3), $(w^{(i)})$, and hence $(u^{(i)})$,
admits the period λ (modd $p^N$, $F_i(x)$). Hence $\lambda_i$ divides λ so that μ divides λ.
But clearly

$(x^{\lambda_i} - 1)W^{(i)}(x) \equiv 0$ (modd $p^N$, $F_i(x)$)

so that

$(x^{\mu} - 1)W(x) \equiv 0$ (modd $p^N$, $F_i(x)$), i = 1, ..., s.

Since the resultant of any two distinct $F_i(x)$ is prime to p, these last
congruences imply that

$(x^{\mu} - 1)W(x) \equiv 0$ (modd $p^N$, F(x)).

Hence by the fundamental theorem again, λ divides μ so that λ equals μ.

The proof of the result for the numerics is similar and will be omitted here.

8. In the present section, we shall solve completely the problem of
determining the null component and the purely periodic component of any
sequence (modd $p^N$, F(x)).

Let us assume that the coefficient $c_k$ in (1.1) is divisible by p. Then in the
Schönemann decomposition (7.1) one of the $F_i(x)$ must be of the form
$x^{t_1} + pV(x)$; let us suppose that it is $F_1(x)$, so that

$F_1(x) = x^{t_1} + pV(x).$

The exponent $t_1$ is simply the number of consecutive coefficients $c_k$,
$c_{k-1}$, $c_{k-2}$, ... which are divisible by p. Let

$F'(x) = F_2(x)\cdot F_3(x) \cdots F_s(x),$

so that Res{$F_1(x)$, F'(x)} is prime to p.

By the fundamental theorem of §5, the sequence (u) is a null sequence
modulo $p^N$ when and only when the congruence

$x^nU(x) \equiv 0$ (modd $p^N$, F(x))

is solvable, U(x) denoting as usual the generator of (u). But this congruence
is solvable when and only when the two congruences

$x^nU(x) \equiv 0$ (modd $p^N$, $F_1(x)$), $\quad x^nU(x) \equiv 0$ (modd $p^N$, F'(x))

are solvable. The first of these congruences is solvable for any U(x), for we
may take $n = Nt_1$. The second is solvable when and only when U(x) ≡ 0
(modd $p^N$, F'(x)) for Res{x, F'(x)} is prime to p. We have thus established
the following theorem.


THEOREM 8.1. If in the Schönemann decomposition modulo $p^N$ of the
polynomial F(x) associated with the difference equation (1.1),

(7.1) $F(x) \equiv F_1(x)\cdot F_2(x) \cdots F_s(x) \pmod{p^N},$

we have $F_1(x) = x^{t_1} + pV(x)$, then a necessary and sufficient condition that a given
solution (u) of (1.1) be a null sequence modulo $p^N$ is that its generator U(x)
satisfy the relation

$U(x) \equiv 0$ (modd $p^N$, $F_2(x) \cdots F_s(x)$).

In this case its numeric is the least value of n such that

(8.1) $x^nU(x) \equiv 0$ (modd $p^N$, $F_1(x)$).


We can prove the following result in very much the same manner.


THEOREM 8.2. With the hypotheses of Theorem 8.1, a necessary and sufficient
condition that a given solution (u) of (1.1) be purely periodic modulo $p^N$ is that
its generator U(x) satisfy the relation

$U(x) \equiv 0$ (modd $p^N$, $F_1(x)$).

The decomposition of (u) into its purely periodic and null components is
now easily effected. For since Res{$F_1(x)$, F'(x)} is prime to p, we can
determine two polynomials $S_1(x)$, $S_2(x)$ such that

$S_1(x)F_1(x) + S_2(x)F'(x) \equiv U(x)$ (modd $p^N$, F(x)).

Suppose that

$S_2(x)F'(x) \equiv V(x)$, $\quad S_1(x)F_1(x) \equiv W(x)$ (modd $p^N$, F(x))

where the degrees of V(x) and W(x) do not exceed k-1, and let (v) and (w)
be the sequences generated by V(x) and W(x) respectively. Then

$U(x) \equiv V(x) + W(x)$ (modd $p^N$, F(x)), $\quad (u) \equiv (v) + (w) \pmod{p^N}$

and (v) is a null sequence and (w) a purely periodic sequence modulo $p^N$.


IV. THE DETERMINATION OF THE NUMERIC 


9. If (u) is a null sequence modulo $p^N$, we have just seen that its generator
is of the form

$U(x) \equiv U'(x)\cdot F_2(x) \cdots F_s(x) \pmod{p^N}$

and that its numeric is the least value of n such that

$x^nU'(x) \equiv 0$ (modd $p^N$, $F_1(x)$).

$F_1(x)$ it will be recalled is of the form $x^{t_1} + pV(x)$. It may happen that V(x)
is also divisible by p. To conserve generality, we therefore assume that
F(x) = — p%O(x);  0(x) 4 O0(mod p); 6(x) of degree less than 4. 


By Schénemann’s theorems,* U’(x) has a decomposition modulo p” of 


the form 
U'(x) = (mod p”) 


where 
M20, Gi(x) = + (x); 
Res {G,(x), U"(x)} prime to 0 (mod pj. 


It follows immediately that the numeric of (u) is the least value of n such that
(9.1) x"Gi(x) =0 (modd F,(x)). 


This minimal value may always be calculated in view of the following two 
theorems: 


THEOREM 9.1. Suppose that a set of polynomials U(x), G(x), ¢(x) are de- 
fined recursively by 


U,-1(x) = G(x) U,1(x) (mod p*1), r = 1,2,---, 
= prU,(x) (mod F;(x)), 
G(x) = x% + pPrt,(x), 
L,= Pr), 


where U,(x) is not divisible by p, U,-s(x) is not divisible by x modulo p, and 
f-(x) is a polynomial of degree less than a, not divisible by p, while Uo(x) 
=G(x)U’'(x), Uo(x) =U"'(x). Then the numbers p are all positive, and after 
a finite number of steps, say 1, we will either have 


NS M+ 1+ po or Res { Ui(x), Fi(x)} prime to p. 


Let | now denote the first time one of these alternatives occurs. Then in the | 
first case, the numeric of (u) is lt,—(o1t+a2+ - - - +c) and in the second case, 
the numeric is lt) - - where is the least value of n such 
that 


(9.2) x*=0 (modd p”, F,(x)). 


* Fricke, work cited, p. 59, p. 65. 




616 MORGAN WARD [July 


THEOREM 9.2. Suppose that a set of polynomials @(x), 6(x) are defined re- 
cursively by 


= (a% + (mod 


where 0,(x) is not divisible by p, and 6,(x) is not divisible by x modulo p, $,(x) 
is a polynomial of degree less than 7, not divisible by p, while r, is the number of 
consecutive coefficients of the zeroth, first, second, --- powers of x in 80,(x) 
which are divisible by p. Then after a finite number of steps, say h, we will either 
have LiSoi1+o2+ - +0, or and Res {6,(x), Fi(x)} prime to p. 

Let h denote the first time one of these alternatives occurs. Then in the first case, 
the least value of n for which the congruence (9.2) is satisfied is 7,=ht,—(ti+7T2 
+ +++ -+7,). In the second case it is qx7n, where q, is the integer next greater 
than or equal to L; divided by o1+-02+ +o. 


The proofs of these theorems are by induction, and are perfectly straightforward
though rather lengthy. They will be omitted here, as the important
result is that the numeric may be calculated if we merely know the Schönemann
decompositions of U(x) and F(x) quite independently of the calculation of the
characteristic number.

The following results are immediate corollaries of Theorems 9.1 and 9.2. 


COROLLARY 1. If


$F(x) \equiv F_1(x) \cdots F_r(x) \pmod{p^N}$,
$F_1(x) = x^{t_1} - p^{L_1}\theta_1(x)$ ($\theta_1(x) \not\equiv 0$ mod p)
is the Schönemann decomposition of the polynomial F(x) mod $p^N$ associated
with the difference equation (1.1), the least upper bound of the numerics of all
solutions of (1.1) modulo $p^N$ is $q t_1$, where q is the integer next greater than or
equal to $N/L_1$.


COROLLARY 2.* The least upper bound for the numerics of all difference equations
(1.1) modulo $p^N$ whose $t_1$ last coefficients are divisible by p is $N t_1$.


COROLLARY 3. The least upper bound for the numeric of all difference equations
(1.1) of order k modulo $p^N$ is Nk.
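
The bound Nk of Corollary 3 can be checked by exhaustive search in very small cases. The sketch below is an illustration under stated assumptions, not part of the paper: it again assumes the recurrence u[n+k] = c1*u[n+k-1] + ... + ck*u[n], treats the numeric as the number of non-repeating terms, and simply enumerates every coefficient vector and every initial vector modulo p^N. The helper names are hypothetical.

```python
from itertools import product


def numeric(coeffs, initial, m):
    """Number of non-repeating terms of the sequence mod m (brute force)."""
    state = tuple(v % m for v in initial)
    seen = {}
    n = 0
    while state not in seen:
        seen[state] = n
        nxt = sum(c * u for c, u in zip(coeffs, reversed(state))) % m
        state = state[1:] + (nxt,)
        n += 1
    return seen[state]


def max_numeric(p, N, k):
    """Largest numeric over all order-k recurrences and starts mod p**N."""
    m = p ** N
    best = 0
    for coeffs in product(range(m), repeat=k):
        for initial in product(range(m), repeat=k):
            best = max(best, numeric(coeffs, initial, m))
    return best


if __name__ == "__main__":
    for p, N, k in [(2, 2, 2), (2, 3, 1), (3, 1, 2)]:
        print((p, N, k), max_numeric(p, N, k), "bound Nk =", N * k)
```

The search space grows like p^(2Nk), so this is only practical for the tiny parameters shown.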
V. THE DETERMINATION OF THE CHARACTERISTIC NUMBER 


10. In this division of the paper we shall reduce the problem of determin- 
ing the characteristic number of any solution of (1.1) to its constituents in 
the sense explained in the introduction. In view of the results of §7, we may 


* Due to Engstrom, p. 218, Theorem 9.




assume that $m = p^N$ where p is a prime, and that the associated polynomial
F(x) is of the form


(10.1) $F(x) = \{\phi(x)\}^a - p\theta(x)$


where it will be recalled that $\phi(x)$ is primary and irreducible modulo p, while
$\theta(x)$ is of lesser degree than F(x).

The results of §8 allow us to assume that (u) is purely periodic. Hence by
the fundamental theorem of §5, the characteristic number of (u) is the least
value of n such that


(10.2) $(x^n - 1)U(x) \equiv 0$ (modd $p^N$, $F(x)$),


where U(x) is the generator of (u).
The following easily established theorem* justifies us in assuming that 
U(x) is not divisible by p. 


THEOREM 10.1. If (u) is any solution of the difference equation (1.1), the 
form of F(x) being unrestricted, and if the integer d is a common factor of the k 
initial values of (u), then the characteristic number of (u) to any modulus m is the 
characteristic number of d-'(u) modulo (m/l), where | is the greatest common 
divisor of m and d. 
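
A small numerical illustration of Theorem 10.1 may help. It is a sketch only: it assumes that the characteristic number is the period of the sequence, that (1.1) reads u[n+k] = c1*u[n+k-1] + ... + ck*u[n], and it uses multiples of the Fibonacci sequence as test data. Function names are hypothetical.

```python
from math import gcd


def characteristic_number(coeffs, initial, m):
    """Period of the sequence mod m, found by brute-force state tracking."""
    state = tuple(v % m for v in initial)
    seen = {}
    n = 0
    while state not in seen:
        seen[state] = n
        nxt = sum(c * u for c, u in zip(coeffs, reversed(state))) % m
        state = state[1:] + (nxt,)
        n += 1
    return n - seen[state]


def reduced_characteristic_number(coeffs, initial, m, d):
    """Period of d^(-1)(u) modulo m / gcd(m, d); d must divide every initial value."""
    assert all(v % d == 0 for v in initial)
    return characteristic_number(coeffs, [v // d for v in initial], m // gcd(m, d))


if __name__ == "__main__":
    fib = [1, 1]  # u[n+2] = u[n+1] + u[n]
    for d in (2, 3, 6):
        start = [0, d]  # d times the Fibonacci sequence 0, 1, 1, 2, ...
        print(d, characteristic_number(fib, start, 8),
              reduced_characteristic_number(fib, start, 8, d))
```

In each case the two printed periods agree, as the theorem asserts.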


Suppose that $\lambda$ is the characteristic number of (u) (modd $p^N$, $F(x)$),
so that


(10.21) $(x^{\lambda} - 1)U(x) \equiv 0$ (modd $p^N$, $F(x)$)


and let $p^K$ be the first elementary divisor of the matrix of the eliminant of
U(x) and F(x) corresponding to the prime p. Then I have shown elsewhere†
that (10.21) implies that


$x^{\lambda} - 1 \equiv 0$ (modd $p^{N-K}$, $F(x)$).
Thus $\lambda$ is a multiple of the principal period of (1.1) modulo $p^{N-K}$.


THEOREM 10.2. If the first elementary divisor of the matrix of the eliminant
of U(x) and F(x) corresponding to the prime p is $p^K$, then the characteristic
number of (u) (modd $p^N$, $F(x)$), N > K, is a multiple of the principal period of
(1.1) modulo $p^{N-K}$.


This theorem is of some practical importance, as it gives us a lower 
limit to the characteristic number of any sequence. The extension to com- 
posite m and F(x) unrestricted is obvious in view of the results of §7. 


* Ward, p. 157, Theorem 5.2. 
t These Transactions, vol. 35 (1933), p. 258. 


Since U(x) in (10.2) is not congruent to zero modulo p, we may assume
that
$U(x) \equiv \{\phi(x)\}^b \psi(x) \pmod{p}$, $a > b \ge 0$,


where Res {$\psi(x)$, $\phi(x)$} is prime to p. Then by Schönemann's second theorem,†
we have


$U(x) \equiv U^*(x)V(x) \pmod{p^N}$
where
(10.3) $U^*(x) = \{\phi(x)\}^b + p\xi(x)$, $\xi(x)$ of lower degree than $U^*(x)$,


and $V(x) \equiv \psi(x)$, mod p.
It follows that the characteristic number of (u) is the least value of n such
that


(10.4) $(x^n - 1)U^*(x) \equiv 0$ (modd $p^N$, $F(x)$).


To avoid circumlocutions, we shall refer to this number as the characteristic
number of the congruence (10.4).
If N = 1, we may replace (10.4) by


(10.5) $x^n - 1 \equiv 0$ (modd p, $\{\phi(x)\}^{a-b}$).


Suppose that the polynomial $\phi(x)$ is of degree t in x. Then the characteristic
number of
$x^n \equiv 1$ (modd p, $\phi(x)$)

is a well known quantity in the Galois field theory‡; for it is simply the exponent
to which belongs the mark associated with a root of $\phi(x) = 0$ in the
Galois field of order $p^t$. We shall regard this number as known to us§; it is a
divisor of $p^t - 1$ and hence prime to p and at most equal to $p^t - 1$. Let us
denote it by $\lambda$. Then there exist polynomials $\phi(x)$ of degree t for which the
corresponding $\lambda$ equals $p^t - 1$; in other words, $p^t - 1$ is not only an upper
bound for $\lambda$, but it is the least upper bound for $\lambda$.
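
For small p and t the exponent described here can be found by direct computation. The following is a minimal sketch, assuming that the quantity wanted is the multiplicative order of x in the ring of polynomials modulo p and modulo a monic polynomial; polynomials are stored as coefficient lists, lowest degree first. The function names are illustrative and not taken from the paper.

```python
def reduce_mod(c, f, p):
    """Remainder of polynomial c modulo (p, f); f is monic."""
    c = [v % p for v in c]
    t = len(f) - 1
    for d in range(len(c) - 1, t - 1, -1):
        lead = c[d]
        if lead:
            for i in range(t + 1):
                c[d - t + i] = (c[d - t + i] - lead * f[i]) % p
    c = c[:t]
    return c + [0] * (t - len(c))


def mul(a, b, p):
    """Product of two coefficient lists with coefficients reduced mod p."""
    out = [0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            out[i + j] = (out[i + j] + ai * bj) % p
    return out


def order_of_x(f, p):
    """Least n > 0 with x**n == 1 modulo (p, f); needs f(0) prime to p."""
    if f[0] % p == 0:
        raise ValueError("x is not invertible modulo (p, f)")
    one = reduce_mod([1], f, p)
    x = reduce_mod([0, 1], f, p)
    power, n = x, 1
    while power != one:
        power = reduce_mod(mul(power, x, p), f, p)
        n += 1
    return n


if __name__ == "__main__":
    print(order_of_x([1, 1, 0, 1], 2))  # x^3 + x + 1 over GF(2)
    print(order_of_x([1, 0, 1], 3))     # x^2 + 1 over GF(3)
```

In the first example the order equals p^t − 1 = 7, so the upper bound is attained; in the second it is 4, a proper divisor of 3^2 − 1 = 8.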

We have then


(10.6) $x^{\lambda} - 1 = \phi(x)\psi(x) + p\zeta(x)$


where $\psi(x)$ and $\zeta(x)$ are polynomials and $\zeta(x)$ is of lower degree than $\phi(x)$.
Since the discriminant of $x^{\lambda} - 1$ is prime to p,


(10.7) $\psi(x) \not\equiv 0$ (modd p, $\phi(x)$).
† Fricke, work cited, pp. 65-66.


‡ See Dickson, Linear Groups, Teubner, 1901, Part I.
§ Compare the remarks in §2 of the introduction.


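From (10.6),
$x^{\lambda} \equiv 1 + \phi(x)\psi(x)$ (modd p, $\phi^2(x)$).
Hence the characteristic number of
$x^n \equiv 1$ (modd p, $\phi^2(x)$)


is $p\lambda$. But since


$x^{p\lambda} \equiv 1 + \{\phi(x)\psi(x)\}^p \pmod{p}$,


$p\lambda$ is also the characteristic number of (10.5) if $2 \le a - b \le p$.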
Proceeding in this manner, we obtain the following result: 


THEOREM 10.3. If U(x) is the generator of a purely periodic solution (u) of
the difference equation (1.1) whose associated polynomial is of the form


$F(x) \equiv \{\phi(x)\}^a \pmod{p}$, while $U(x) \equiv \{\phi(x)\}^b V(x) \pmod{p}$,
where Res {V(x), $\phi(x)$} is prime to p and $\phi(x)$ is irreducible modulo p, then
the characteristic number of (u) modulo p is $p^q\lambda$ where the integer q is such that

$p^{q-1} < a - b \le p^q$
and $\lambda$ is the least value of n such that
$x^n \equiv 1$ (modd p, $\phi(x)$).

THEOREM 10.4. Under the hypothesis of Theorem 10.3, the principal period
of (1.1) modulo p is $p^r\lambda$ where the integer r is determined by the condition
$p^{r-1} < a \le p^r$

and the least upper bound for the principal period is $p^r(p^t - 1)$, where t is the
degree of the polynomial $\phi(x)$ in x.

We leave the formulation of the corresponding theorems when F(x) is 
unrestricted in form and m any square-free integer to the reader. 
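
The dependence of the period on the exponent of the irreducible factor can be seen in a short experiment. This is a hedged sketch rather than the argument of the paper: it assumes that the relevant quantity is the order of x modulo (p, φ^e), with e standing for a − b in Theorem 10.3 (or for a in Theorem 10.4), and compares it with p^q·λ where q is the least integer with p^q ≥ e. All helper names are hypothetical; polynomials are coefficient lists, lowest degree first.

```python
def mul(a, b, p):
    """Product of two coefficient lists with coefficients reduced mod p."""
    out = [0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            out[i + j] = (out[i + j] + ai * bj) % p
    return out


def reduce_mod(c, f, p):
    """Remainder of polynomial c modulo (p, f); f is monic."""
    c = [v % p for v in c]
    t = len(f) - 1
    for d in range(len(c) - 1, t - 1, -1):
        lead = c[d]
        if lead:
            for i in range(t + 1):
                c[d - t + i] = (c[d - t + i] - lead * f[i]) % p
    c = c[:t]
    return c + [0] * (t - len(c))


def poly_power(phi, e, p):
    """phi**e with coefficients mod p."""
    result = [1]
    for _ in range(e):
        result = mul(result, phi, p)
    return result


def order_of_x(f, p):
    """Least n > 0 with x**n == 1 modulo (p, f); needs f(0) prime to p."""
    if f[0] % p == 0:
        raise ValueError("x is not invertible modulo (p, f)")
    one = reduce_mod([1], f, p)
    x = reduce_mod([0, 1], f, p)
    power, n = x, 1
    while power != one:
        power = reduce_mod(mul(power, x, p), f, p)
        n += 1
    return n


def predicted(lam, e, p):
    """p**q * lam with q the least integer such that p**q >= e."""
    q = 0
    while p ** q < e:
        q += 1
    return p ** q * lam


if __name__ == "__main__":
    phi, p = [1, 1, 1], 2  # x^2 + x + 1, irreducible mod 2
    lam = order_of_x(phi, p)
    for e in range(1, 6):
        print(e, order_of_x(poly_power(phi, e, p), p), predicted(lam, e, p))
```

The computed order and the predicted value agree for each exponent tried, jumping by a factor p each time e passes a power of p.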


11. We are now in a position to attack (10.4) in the general case when N
is greater than one. We have, with the notation of Theorem 10.4,


(11.1) $x^{p^r\lambda} - 1 \equiv p^{\sigma}V(x) \pmod{F(x)}$
where $\sigma$ is a positive integer, and V(x) is of lesser degree than F(x). If V(x)
= 0, we shall think of $\sigma$ as arbitrarily large. If V(x) ≠ 0, the value of $\sigma$ is fixed
by the condition $V(x) \not\equiv 0 \pmod{p}$. Then


(11.2) $U^*(x)V(x) \equiv p^{\rho}W(x) \pmod{F(x)}$


where $\rho$ is a positive integer or zero, and W(x) is of lesser degree than F(x).
If W(x) = 0, we assign an arbitrarily large value to $\rho$. Otherwise, the value of
$\rho$ is fixed by the condition $W(x) \not\equiv 0 \pmod{p}$.




$\rho$ may be equally well defined as the largest whole number M such that
$U(x)V(x) \equiv 0$ (modd $p^M$, $F(x)$).


Unless V(x) divides F(x) (when U(x) may be taken so that W(x) = 0), $\rho$ has
a definite upper bound† depending only on V(x), F(x) and p.
From (11.1), we deduce that 

—1) 
= (1+ pV (x))* = 14+ (x) + -+ + (mod F(x)). 
Hence from (11.2), 

— 1) = (x) + V(x) +--+ (mod F(zx)), 
— 1) = (modd pete+t+1, F(x), 


save possibly in the case p = 2, $\sigma$ = 1, which we shall exclude. From this last
congruence, we deduce the following theorems:


THEOREM 11.1. If p is an odd prime, N>1, the characteristic number of the 
congruence (10.4) is if and if NZp+o, where p and 
are determined by the congruences (11.1) and (11.2). 


THEOREM 11.2. If p is an odd prime, the least upper bound for the charac- 
teristic number of the congruence (10.4) for all choices of U* (x) is pd if N<p 
and \pr+’—» if N =p, where p is determined by the congruence (11.2). 


The fundamental problem of finding the characteristic number of any
linear recursive sequence to any modulus m has thus finally reduced to determining
the exponents $\sigma$ and $\rho$ in (11.1) and (11.2). We shall first seek to
determine $\sigma$ in the case when p is odd and the exponent a in (10.1) is greater
than unity.

If u is an indeterminate, and if we let 


u2 
(u) = 


K(u) = 
2/3 374 


HO(2) = H((9)*"), K(x) = = Lo"), 


t These Transactions, vol. 35 (1933), p. 258. 


2 p—2/p—-1 


and, for uniformity of notation, 
H@”(x) = §(x), 


then it follows by induction on r from (10.6) that for any positive integral 
value of r, 


x" = 1+ pOx(x) + p'Ox(x) + (mod 
where 
(11.3) = H°-(x), O2(x) = KO-Y(x) + (x). 

Now by (10.1), 
= = + 96) = (mod F(x)). 

Therefore 
(11.4) x? = 1+ + + (modd F(x)). 
On comparing (11.4) and (11.1), we have 


(11.41) pV (x) = + Or + pO. —(modd p*, F(z). 


Therefore a necessary and sufficient condition that o be greater than one is 
that (modd F(x)). This congruence is equivalent to 


(11.5) + yr — +... = 0 (modd p, {¢(x)}*), 


which may be looked upon as a condition upon @(x). 
If pp—a>p’— or 6(x) =0 (mod ), the congruence has no solutions. For 
if it had a solution, we would have 


yr '= 0 (modd $(z)) 
contradicting (10.7). If pp» —a<p*-! and @(x) 40 (mod ), (11.5) implies that 
(x) = 0 (modd p, {¢(x)}*), where c = p”'— +a. 


If 6(x) =0 (modd p, {¢(x) }¢+"), we again obtain a contradiction of (10.7). 
Hence 
A(x) = x(x) {(x) }° (mod p), x(x) # 0 (modd p, $(x)). 
On substituting in (11.5), we find that 
(11.6) + 1=0 (modd p, {¢(x)}?™"). 
This criterion can be greatly simplified. For if y=", 


r r—1 




Hence (11.6) is equivalent to 
x(x) {¥(y)}1+1=0 (modd $(y)). 


Since ¥(y)30 (modd #, $(y)), there exists a polynomial #(y) of degree 
less than $(y) such that 


H(y){¥(y) +1 = 0 (modd p, ¢(y)). 
Hence x(x) =0(y) (modd $(y)), so that we may take 


x(x) = 
where 


(11.7) (modd p, $(#)). 
If we let 
= {o(x) Fi(x) = — 
the results we have obtained may be summarized in the following theorem: 


THEOREM 11.3. If p is an odd prime, a>1, the exponent o in (11.1) is gen- 
erally unity. It is always unity if p*—a>p’-', or if 0(x)=0 (mod ) or if 
p’—a>p-, 0(x) 40 (modd p, o(x)). It is greater than unity only when 
F(x) =Fi(x) (mod p*) where the polynomial F,(x) has been defined above. 


The further study of the exceptional case when $F(x) \equiv F_1(x) \pmod{p^2}$
would take us too far afield and will not be embarked upon here. The theorems
of §13 on the determination of $\rho$ when a = 1 will give the reader an idea
of the considerations which apply. We do however gain additional insight
into the close relationship between recurring series and higher congruences
if we seek to determine the polynomial $\psi(x)$ in (11.7) which must be known
(modd p, $\phi(x)$) for $F_1(x)$ to be well defined. It will be recalled that $\psi(x)$ was
originally defined as the quotient obtained on dividing $x^{\lambda} - 1$ by $\phi(x)$. Hence
if

x — 1 = pL(x) (modd p*, ¢?(x)), L(x) of lesser degree than ¢7(x), 
¥(x) satisfies the congruence 
¥(x) = L(x) (modd 9, ¢(x)). 


It is sufficient then for our purpose to determine L(x). 


Now if we set 
= — dyxt—---— di, 


= (mod $7(z)), 
kat 


Wa.i41 = 0 (n = 0,1,2,---), 




then it is easily verified that the constants w,,, satisfy the following relations: 
= Wn k+1 + (k = 1, l; n= 0, 1, 2, ees ), 


Wnk = (n <b) 


where 6,,:-, is the Kronecker 6. It follows without much difficulty that 
Wo,k, Wi,k, We,k, ** + iS a particular solution of the difference equation 
(11.8) = + + 


For convenience denote the sequence wo,1-1, @1,1-1, W2,1-1, Whose initial 
values are 0,0,--- ,0, 1 simply by (w). Then we may write for a fixed k 


Wak = Doce 
j=l 
where the c,; are integers determined by the / equations 


1 
= 
j=1 


Wx) = Dicesx!-*, 


k=l 


W ;(x) is a polynomial of degree /—1 in x with integral coefficients, which we 
may regard as known to us. Then 


l 


l 
k=1 k=l j=l 


l 
= 


j=1 
Hence 
pL(x) = + +--- + wWi(x) +1 (mod ¢7(x)) 


so that L(x) is determined if we know the residues modulo $p^2$ of the l terms
$w_{\lambda+l-1}, w_{\lambda+l-2}, \cdots, w_{\lambda}$ of the solution 0, 0, ⋯, 0, 1, $d_1$, ⋯ of (11.8).
There seems to be no way of obtaining these residues short of calculating the
whole sequence (w) modulo $p^2$ step by step out to $\lambda + l$ terms. Such a calculation
will at the same time determine $\lambda$ after at most $p^l - 1$ terms have been
found.
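
The step-by-step calculation mentioned here is easy to mechanize. The sketch below rests on an assumption, since (11.8) is not legible in this copy: it takes the difference equation to be w[n+l] = d1*w[n+l-1] + ... + dl*w[n], with initial values 0, ..., 0, 1, so that the next term is d1 as the text states. The function names are illustrative, and the second function assumes dl is prime to p so that the residues mod p are purely periodic.

```python
def w_sequence(d, p, count):
    """First `count` terms of 0, ..., 0, 1, d1, ... reduced modulo p**2.

    Assumed recurrence: w[n+l] = d1*w[n+l-1] + ... + dl*w[n], d = [d1, ..., dl].
    """
    l = len(d)
    m = p * p
    w = [0] * (l - 1) + [1]
    while len(w) < count:
        w.append(sum(dj * w[-j] for j, dj in enumerate(d, start=1)) % m)
    return w[:count]


def period_mod_p(d, p):
    """Least n > 0 after which the residues mod p return to 0, ..., 0, 1."""
    l = len(d)
    start = [0] * (l - 1) + [1]
    window = start[:]
    n = 0
    while True:
        nxt = sum(dj * window[-j] for j, dj in enumerate(d, start=1)) % p
        window = window[1:] + [nxt]
        n += 1
        if window == start:
            return n


if __name__ == "__main__":
    d, p = [1, 1], 3                 # phi(x) = x^2 - x - 1, irreducible mod 3
    lam = period_mod_p(d, p)
    print(lam)
    print(w_sequence(d, p, lam + len(d)))
```

For this choice of φ(x) the period mod 3 is 8, and the second line prints the λ + l = 10 residues modulo 9 that the text calls for.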

12. We are now in a position to study the value of $\rho$ in (11.2) in the general
case when $\sigma$ = 1. We have from (10.3) and (11.41)


= 
| 
Thus if 


(12.1) U*(x)V(x) = + + + £0: + 
(modd F(x)). 
Hence p is greater than zero when and only when 
+ = (modd 9, F(x)); 
that is, when and only when 
(12.2) + "(1 — +--+) =0 (modd p, {¢(x)}*). 


If pp—a+b2a, pr+bea, (12.2) is satisfied for any: choice of 0(x). In 
the contrary case, it is either insolvable or imposes a condition upon @(z). 
We find in fact that there are no solutions in any one of the five following 
cases: 


(i) —atb2 a, <4; 
Gi) p+) <a, p'—a> pr; 
Gi) = 0 (mod p), + b <a; 
(iv) —at+b<a,p'+5 <a, pr—a, 
A(x) x(x) {o(x)} +2 (mod 9), 


where x(x) { (x) =0 (modd 9, 


(v) p’ #0 (modd p, { (x) } 


Thus generally speaking, if $\sigma$ = 1, $\rho$ = 0 unless $b \ge 2a - p^r$.
Passing to this case, we have from (10.1), (11.21) and (12.1) 


U*(x)V (x) = ployee? + opr” — + 

+ "(1 — Ayr 

where the last group of terms within the bracket must be replaced by 
ot (x)(1—Yo+ ---) if r=1, and the exponents d and e in the first two 
groups of terms are 20 and have the values p*—2a+), p*-!+b—a. 

Hence p=1 unless the expression in brackets above is congruent to zero 
(modd p, F(x)) or 


12.3 




where E, F, G, H denote polynomials in x which are not ‘congruent to zero 
(modd p, ¢(x)) with integral coefficients:modulo p. The térms ¢°+2?"*E 
must be replaced by if r=1. 

It is not difficult to show that the lowest exponent of $\phi$ occurring in (12.3)
is either d or e so that (12.3) imposes a condition upon $\theta(x)$ of the type appearing
under (12.2),


$\theta(x) \equiv \chi(x)\{\phi(x)\}^g \pmod{p}$.


The exponent g here depends upon the relative magnitudes of a, b, $p^r$,
$p^{r-1}$, $p^{r-2}$ but may be shown to be positive. We may therefore state the following
theorem:


THEOREM 12.1. If p is an odd prime, F(x) ={¢(x)}*+p0(x), a> 1, 
6(x) £0 (modd p, o(x)), then p in (11.2) is unity if p’+b=2a, and 
zero otherwise. If =0 (mod p is zero.if and if p-'+b2a 
it is unity unless both and are and 6(x) satisfies a special 
condition. If 0(x) =0 (modd p, ¢(x)) 40 (mod p), the same results usually apply 
unless F(x) is of a special form similar to that of F,(x) in Theorem 11.4. 


13. We shall conclude by discussing the case when the exponent a in
(10.1) is unity so that


(13.1) $F(x) = \phi(x) - p\theta(x)$.


A necessary condition for this to hold is that p should not divide the discriminant
of F(x). Hence if this discriminant is not zero, the results of this section
will apply to the powers of all primes save a finite number.

If the sequence (u) is not divisible by p, Res {U(x), F(x)} is necessarily
prime to p, so that the characteristic number of (u) modulo $p^N$ is the principal
period of (1.1), and hence the characteristic number of the congruence


$x^n \equiv 1$ (modd $p^N$, $F(x)$).

With the notation of §10, let $\lambda$ be the characteristic number of the congruence

$x^n \equiv 1$ (modd p, $\phi(x)$),


so that we have identically in x


(13.2) $x^{\lambda} - 1 = \phi(x)\psi(x) + p\zeta(x)$, $\psi(x) \not\equiv 0$ (modd p, $\phi(x)$).


We shall now establish the following comprehensive theorem: 




THEOREM 13.1. Let p be an odd prime, $\phi(x)$ an irreducible polynomial
modulo p, and suppose that the polynomial F(x) associated with the difference
equation (1.1) is of the form (13.1). Furthermore, let


$F_2(x) = \phi(x) - p\theta_1(x)$
where $\xi(x) = \theta_1(x)$ is a solution of the congruence
$\psi(x)\xi(x) + \zeta(x) \equiv 0$ (modd p, $\phi(x)$),


$\psi(x)$ and $\zeta(x)$ being given† by (13.2).

Then if $F(x) \not\equiv F_2(x) \pmod{p^2}$, the characteristic number modulo $p^N$ of any
solution of (1.1) which is not divisible by p is $p^{N-1}\lambda$, where $\lambda$ is the least value of
n such that


$x^n \equiv 1$ (modd p, $\phi(x)$).
On the other hand, if $F(x) \equiv F_2(x)$, mod $p^2$, there exists a set of polynomials
$F_2(x), F_3(x), \cdots, F_T(x), \cdots$, depending only upon p, $\phi(x)$, and $\zeta(x)$,


such that if $F(x) \equiv F_T(x) \pmod{p^T}$, $\not\equiv F_{T+1}(x) \pmod{p^{T+1}}$, the characteristic
number is $\lambda$ or $p^{N-T}\lambda$ according as $N \le T$ or $N \ge T$.
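
The generic case of Theorem 13.1 can be illustrated with the Fibonacci recurrence. This is a sketch under stated assumptions and not part of the proof: it takes F(x) = x^2 − x − 1, which is irreducible modulo 3 and modulo 7, takes the characteristic number to be the period, and finds it by brute force. Since the last coefficient of the recurrence is prime to the modulus, the sequence is purely periodic and the search terminates when the initial state returns.

```python
def characteristic_number(coeffs, initial, m):
    """Period of a purely periodic recurring sequence mod m.

    Assumed recurrence: u[n+k] = c1*u[n+k-1] + ... + ck*u[n], with ck prime
    to m so that the initial state is certain to recur.
    """
    start = tuple(v % m for v in initial)
    state, n = start, 0
    while True:
        nxt = sum(c * u for c, u in zip(coeffs, reversed(state))) % m
        state = state[1:] + (nxt,)
        n += 1
        if state == start:
            return n


if __name__ == "__main__":
    fib = [1, 1]
    for p in (3, 7):
        lam = characteristic_number(fib, [0, 1], p)
        for N in (1, 2, 3):
            observed = characteristic_number(fib, [0, 1], p ** N)
            print(p, N, observed, p ** (N - 1) * lam)
```

For these primes the observed period modulo p^N agrees with p^(N−1)·λ, which is the behaviour the theorem describes when F(x) is not congruent to F_2(x) modulo p^2.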


We have
$x^{\lambda} - 1 = \psi(x)F(x) + p(\theta(x)\psi(x) + \zeta(x))$.


Suppose first that $\theta(x)\psi(x) + \zeta(x) \not\equiv 0$ (modd p, $\phi(x)$). Then
$x^{\lambda} \equiv 1 + pK(x) \pmod{F(x)}$


where K(x) is of lesser degree than F(x) and not divisible by p. On raising
this last congruence to the $p^r$th power, we obtain


(13.3) $x^{p^r\lambda} \equiv 1 + p^{r+1}K(x) + \binom{p^r}{2}p^2K^2(x) + \cdots \pmod{F(x)}$.
Hence if p is an odd prime,
$x^{p^r\lambda} \equiv 1 + p^{r+1}K(x)$ (modd $p^{r+2}$, $F(x)$).
But clearly
$x^{p^r\lambda} \not\equiv 1$ (modd $p^{r+2}$, $F(x)$).
Since the characteristic number of (13.1) for N = r+2 is a multiple of its
characteristic number for N = r+1, it is exactly equal to $p^{N-1}\lambda$.
Now let us assume that 


} They may be determined sufficiently to define F(x) by the procedure sketched in §11. 




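$\psi(x)\theta(x) + \zeta(x) \equiv 0$ (modd p, $\phi(x)$).


This congruence has a unique solution modulo p of degree less than $\phi(x)$.
Let us denote it by $\theta_1(x)$, and set

$F_2(x) = \phi(x) - p\theta_1(x)$.

Then if $F(x) \not\equiv F_2(x) \pmod{p^2}$, $\theta(x) \not\equiv \theta_1(x) \pmod{p}$. Consequently
$\psi(x)\theta(x) + \zeta(x) \not\equiv 0$ (modd p, $\phi(x)$) and the argument just given is applicable. Assume


then that
$F(x) \equiv F_2(x) \pmod{p^2}$.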


Consider the polynomials 
F(x), Fe(x), F(x), , Fi(x), 


defined by the recursive relations 


Fi(x) = o(x) — pOx-i(x), Ox(x) = Ox_i(x) + Oo(x) = 0, 
¥(x)Ox_i(x) + (x) = p* (x) (modd p*, Fi.(x)), 


(13.4) 
¥(x)0.(x) + r.(x) = 0 (modd = 1,2,3,---. 


These relations are consistent with one another; for if k = 1 they give $F_1(x)
= \phi(x)$ and for k = 2 they give the polynomial $F_2(x)$ defined above. If we assume
that they are consistent for k = 1, 2, 3, ⋯, s it easily follows that they
are consistent for k = s+1.

Now suppose that 


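$F(x) \equiv F_T(x) \pmod{p^T}$, $\not\equiv F_{T+1}(x) \pmod{p^{T+1}}$, $T \ge 2$.


Then
$x^{\lambda} - 1 \equiv 0$ (modd $p^T$, $F(x)$), $\not\equiv 0$ (modd $p^{T+1}$, $F(x)$).


For by (13.2) and the relations (13.4),
$x^{\lambda} - 1 = \phi(x)\psi(x) + p\zeta(x) = \psi(x)\{F_T(x) + p\theta_{T-1}(x)\} + p\zeta(x)$
$= \psi(x)F_T(x) + p(\psi(x)\theta_{T-1}(x) + \zeta(x))$
$\equiv p(\psi(x)\theta_{T-1}(x) + \zeta(x)) \pmod{F_T(x)}$
$\equiv p \cdot p^{T-1}\tau_{T-1}(x)$ (modd $p^T$, $F_T(x)$)
$\equiv 0$ (modd $p^T$, $F_T(x)$), (modd $p^T$, $F(x)$).
In like manner it can be shown that
$x^{\lambda} - 1 \not\equiv 0$ (modd $p^{T+1}$, $F(x)$).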


+ The ©@(x) here have no connection with those of §11. 




Hence we have
$x^{\lambda} \equiv 1 + p^T K(x) \pmod{F(x)}$,
where $K(x) \not\equiv 0$ (modd p, $F(x)$). On raising this congruence to the appropriate
power, we find that whether p be even or odd the characteristic number is
$p^{N-T}\lambda$ or $\lambda$ according as $N \ge T$ or $N \le T$.
The case p = 2, T = 1 demands separate treatment. If $\theta(x)\psi(x) + \zeta(x)
\equiv K(x) \not\equiv 0$ (modd 2, $\phi(x)$), we obtain from (13.3), on putting p = 2,
$x^{2^r\lambda} \equiv 1 + 2^{r+1}K(x)(1 + (2^r - 1)K(x) + \cdots) \equiv 1 + 2^{r+1}K(x)(1 - K(x))$
(modd $2^{r+2}$, $F(x)$).
If $K(x) \not\equiv 1 \pmod{2}$, the previous argument for p odd is applicable. But in
case $K(x) \equiv 1 \pmod{2}$, the characteristic number is a divisor of $2^r\lambda$.
Since K(x) is of lesser degree than F(x), the most general assumption is
that


$K(x) + 1 = 2^s L(x)$ where $L(x) \not\equiv 0 \pmod{2}$.

Then
(13.5) $x^{\lambda} \equiv -1 + 2^{s+1}L(x) \pmod{F(x)}$,
$x^{2\lambda} \equiv 1$ (modd $2^{s+2}$, $F(x)$).


Hence if N = 1, the characteristic number is $\lambda$, while if $s + 2 \ge N > 1$, the characteristic
number is $2\lambda$. On raising (13.5) to a power of 2, we find that if
$N \ge s + 2$, the characteristic number is $2^{N-s-1}\lambda$.

These results determine the characteristic number in the excluded case 
of (11.1) when o=1 and p=2 for all F(x) of the form ¢(x)—20(x). The 
further discussion of the characteristic number for powers of 2 demands a 
special treatment which will be given elsewhere. 


CALIFORNIA INSTITUTE OF TECHNOLOGY, 
PASADENA, CALIF. 


SETS OF k-EXTENT IN n-DIMENSIONAL SPACE†


BY 
R. L. JEFFERY 


1. Introduction. Let A be any point set on the bounded n-dimensional
domain D. In his development of the theory of measure C. Carathéodory‡ has
defined in connection with the set A a measurable set $\bar{A}$ which has come to be
called the massgleiche Hülle of A.§

Let $B_n$ be a sequence of open sets containing A such that $B_n \supset B_{n+1}$,
and $\lim \mu B_n = \mu^* A$. Then the set


$\bar{A} = \bigcap_{n} B_n$


contains A, and is measurable with


$\mu\bar{A} = \mu^* A$.

The number of ways of selecting each $B_n$ is more than countable, and no
rule is given for any particular choice. Consequently it is impossible to say of
every point of the domain D whether or not it belongs to the set $\bar{A}$. This
amounts to saying that the set $\bar{A}$ is not well-defined.

It is possible to replace the set $\bar{A}$ by a set A' which is effectively defined.
Let B be the complement of A on D. Let $w_k$ be a sequence of cells with a
point b of B as center, and with equal side lengths tending to zero as k increases.
Let


Aw, 


b, wr) = 
p(b, wx) 
Since at each point of B except at most a null set the outer metric density of
B is unity, it follows that $\rho(b, w_k)$ is defined for all values of k at almost all
points of B. Let C be the part of B for which $\rho(b, w_k)$ is defined for all values
of k, and for which


$\overline{\lim}\ \rho(b, w_k) > 0$.


The set 
A'=A+C 


† Presented to the Society, March 25, 1932; received by the editors June 14, 1932, and, in
revised form, November 18, 1932.

‡ Über das lineare Mass von Punktmengen, Göttinger Nachrichten, 1914.

§ Hahn, Theorie der Reellen Funktionen, p. 435. Carathéodory, Vorlesungen über reelle
Funktionen, p. 260.




630 R. L. JEFFERY [July 


contains A, and is effectively defined in terms of A. We also have the following:

I. The set A' is measurable in the sense of Carathéodory, and


$\mu A' = \mu^* A$.
II. A necessary and sufficient condition that A be measurable is that
$\mu^* C = 0$.


Though not explicitly stated, the proofs of I and II are contained in a 
previous discussion.t 

An analogous situation exists in connection with plane sets of linear
extent. Let A be such a set with finite outer linear measure‡ equal to l.
Let $U_n = u_{n1}, u_{n2}, \cdots$ be a sequence of open convex areas which contains A,
with $d_{ni}$, the greatest diameter of $u_{ni}$, tending to zero, and with $\sum d_{ni}$ tending
to l. Then the set

$\bar{A} = \bigcap_{n} U_n$


contains A, is linearly measurable,§ and
$L\bar{A} = L^* A$.


In this case too there is no way to determine for each point of the domain D
containing A whether or not it belongs to the set $\bar{A}$. In the present paper we
determine a set A' which contains the plane set of linear extent A, which is
linearly measurable, with


$LA' = L^* A$,


and which is well-defined in terms of A.

That these sets A' are well-defined has some significance.|| A more important
consideration is, however, that the concepts involved in and leading up to
their definition combine to form an elegant and very useful tool for handling
certain types of problems.¶ We do not restrict ourselves to plane sets, but
carry through the discussion for sets of extent k in n-dimensional space. We
show that such sets have properties of density similar to the properties of
density which Besicovitch†† and Sierpinski‡‡ respectively have shown to hold


† Annals of Mathematics, (2), vol. 33, pp. 449-451.

‡ Carathéodory, Göttinger Nachrichten, loc. cit., §23.

§ Carathéodory, Göttinger Nachrichten, loc. cit., §28.

|| Sierpinski, Fundamenta Mathematicae, vol. 2, p. 112.

¶ See Annals of Mathematics, (2), vol. 33, pp. 452-459, these Transactions, vol. 34, p. 650, also
the concluding section of this paper.

†† Mathematische Annalen, vol. 98, p. 422.

‡‡ Fundamenta Mathematicae, vol. 9, p. 172.


1933] SETS IN n-DIMENSIONAL SPACE 631 


for linearly measurable plane sets, and for plane sets which are not neces- 
sarily linearly measurable but which have linear extent. Although there would 
be no difficulty in giving independent proofs of the various results, to con- 
serve space we have, whenever possible, based our proofs on those of Cara- 
théodory and Besicovitch. 

Let $S_n$ be the n-dimensional euclidean space, and $S_k$ a k-dimensional flat
space† in $S_n$. Let U be an open convex‡ domain of $S_n$. For a given U let $S_k$
be such that the k-dimensional measure of $S_kU$ is a maximum, and denote
this maximum measure by $l^k$. We shall call $l^1$ the greatest diameter of U,
and denote it by d. Let A be any bounded set in $S_n$, and $\rho$ any positive number.
Put A in a countable set of open convex domains $u_i$. Let $L_k^{\rho}(A)$ be the
lower bound of $\sum l_i^k$ for all possible such enclosures with $d_i < \rho$. Evidently
$L_k^{\rho}(A)$ does not decrease as $\rho$ decreases. Let


$L_k(A) = \lim_{\rho \to 0} L_k^{\rho}(A)$.


It is clear that $L_k(A) \ge 0$, and may be infinite.

The largest value of k for which $L_k(A) \ne 0$ determines the extent of the
set A, and the number $L_k(A)$, finite or infinite, is the outer k-dimensional
measure of the set A. If for each arbitrary set W of extent k


$L_k(W) = L_k(AW) + L_k(W - AW)$,

the set A is measurable. This definition of measurability, which is based on 
that of Carathéodory for sets of linear extent, coincides with that of Lebesgue 
for n-dimensional sets. But not all such sets are measurable in the sense of 
Lebesgue. Likewise not all sets of extent k are measurable in the sense of 
Carathéodory. An obvious example is a linear set in the plane which is non- 
measurable in the sense of Lebesgue. 

The theory developed by Carathéodory for linear outer measure, and for 
measurability when the set A is measurable, is easily shown to hold for the 
measure function L,(A). For convenient reference we recall such results of 
this theory as we shall have occasion to use. 

CI. If the sets A and B are of extent k, and if A contains B, then


$L_k(B) \le L_k(A)$.
CII. If A is the set each point of which is on one of the sets $A_1, A_2, \cdots$, then
$L_k(A) \le L_k(A_1) + L_k(A_2) + \cdots$.


t A space which by a proper choice of coordinate axes can be represented by 1.=x2= °° 
Xn_=0. A domain U is convex if every S2U is convex. 




CIII. If $A_1, A_2, \cdots$ is a sequence of sets such that $A_n$ contains $A_{n-1}$,
and A is the limit set, then
$\lim L_k(A_n) = L_k(A)$.
CIV. If A and B are such that every point of A is a distance not less than
$\delta > 0$ from any point of B, then


$L_k(A) + L_k(B) = L_k(A + B)$.
2. Some general lemmas. In this section we prove three lemmas.
LEMMA I. If A is such that $L_k(A)$ is finite and different from zero, then
$L_{k-1}(A)$ is infinite and $L_{k+1}(A) = 0$.


That $L_{k+1}(A) = 0$ follows readily from the fact that, for any set of domains
$u_i$ with $d_i < \rho$, $\sum l_i^{k+1} < \rho \sum l_i^k$. We then have $L_{k-1}(A)$ infinite. For a supposition
that $L_{k-1}(A)$ is finite makes $L_k(A) = 0$.


LEMMA II.† Let $V = v_1, v_2, \cdots$ be an infinite sequence of open convex
domains in $S_n$, and A any set of points. Then
$L_k(AV) + L_k(A - AV) = L_k(A)$.


First let V consist of a single domain, and let $U_1, U_2, \cdots$ be a sequence of
closed domains interior to V and such that $U_n$ contains $U_{n-1}$ and $\lim U_n = V$.
Let $A_n = AU_n$. Then $\lim A_n = AV$. The sets $A_n$ and $A - AV$ are on closed
mutually exclusive domains. Hence these two sets satisfy the conditions of
CIV, and it follows that


$L_k(A_n) + L_k(A - AV) = L_k(A_n + A - AV) \le L_k(A)$.
And since by CIII $\lim L_k(A_n) = L_k(AV)$ we have
$L_k(AV) + L_k(A - AV) \le L_k(A)$.


But by CII

$L_k(AV) + L_k(A - AV) \ge L_k(A)$.
These two inequalities give the Lemma for V a single region. The extension
to the case where V consists of a finite number of regions is obvious. When
$V = v_1, v_2, \cdots$, set $V_n = v_1, v_2, \cdots, v_n$. Then


(1) $L_k(AV_n) + L_k(A - AV_n) = L_k(A)$.


† It has been remarked by Mr. J. F. Randolph that Lemma II follows from the definition of
Carathéodory for k-dimensional measurability, provided the open set V in $S_n$ is considered to be
k-dimensional measurable in the sense of Carathéodory, with infinite measure if k < n. In this connection
we note that if k < n every open set V in $S_n$ does not satisfy the criterion of measurability
which is obtained for sets of finite extent in Theorem XII of the present paper.




The set $A - AV_n$ tends to $A - AV$, and the set $AV_n$ tends to AV. And since
$AV_n$ contains $AV_{n-1}$ it follows from CIII that
$\lim L_k(AV_n) = L_k(AV)$.

Hence

(2) $L_k(AV) + \lim L_k(A - AV_n) = L_k(A)$.

It follows from CI that

$L_k(A - AV) \le \lim L_k(A - AV_n)$.

Suppose the equality sign does not hold. Then from (2) we get
$L_k(AV) + L_k(A - AV) < L_k(A)$,

which, by CII, is not true. Hence
$L_k(AV) + L_k(A - AV) = L_k(A)$,


and the Lemma is proved.


LEMMA III. Let $V(\rho)$ denote any finite or countably infinite set of open convex
domains $v_1, v_2, \cdots$ with greatest diameter $d_i < \rho$. Then to any set A of
extent k and any positive number $\epsilon$ there corresponds a number $\rho_1 > 0$ such that
for any set $V(\rho)$ with $\rho < \rho_1$ the inequality

$L_k[AV(\rho)] < \sum l_i^k + \epsilon$
is satisfied.

This Lemma has been proved for linearly measurable plane sets by Besicovitch.†
His inequality (2) follows from the measurability of the set. But the
corresponding inequality for any set follows from Lemma II above. The
remainder of the argument is similar to that of Besicovitch with $\sum l_i^k$ replacing
$\sum d_i$ for the various regions involved.

3. Density. Let A be a set of extent k, a any point of A, and H(a, r) an 
n-dimensional hypersphere with center a and radius r. Let kh, be the k- 
dimensional measure of the maximal k-dimensional flat space that can be in- 
scribed in H(a, r). Let 


Ik [A H(a, r)] 
hig 


D(a, r) = 


and let D*(a) and D,(a) be the upper and lower limits respectively of D(a, r) 
as r tends to zero. These numbers are respectively the upper and lower 
densities of A at a. 


t Loc. cit., p. 427. 




It has been shown by Besicovitchf that the linear measure of a plane set 
depends on the type of region u; used in estimating this measure. A similar 
state of affairs is to be expected for sets with extent greater than unity. In 
estimating outer measure we shall take into consideration only types of re- 
gions u; which are such that 


Lg = dxd*, 
where ¢, is a constant depending on & and on the particular type of region, 
and d is the greatest diameter of the region. For a hypersphere H(a, r) let 
= ,(2r)*. 
We then have 


THEOREM I. If A is any set of extent k, then for almost all points of A,


1 
< D*(a) £1. 
2* ve 
Let E be the part of A for each point of which D*(a) >1. Let Ey be the 
part of E about each point e of which there exists a sequence of hyperspheres 
H(e, with 
L,[AH(e, ri)] 


(1) 


By Lemma III there exists p>0 such that for any set of hyperspheres H; 
with r;<p we have 


(2) < +e. 


From the set of hyperspheres defined in (1) let those with $r_i > \rho$ be discarded.
It is then possible to use Vitali's argument to show the existence of a
countable non-overlapping sequence of the remaining hyperspheres $H_i$ which
contain almost all of $E_\lambda$. From (1) for this sequence we have


(3) +) < Li(A). 
But from (2) we get 
(4) > — €. 


But for $\lambda$ sufficiently small $L_k(E_\lambda) > 0$, and $\varepsilon$ can be taken arbitrarily small
independently of $\lambda$. This makes (3) and (4) contradictory, which proves
that $D^*(a) \le 1$ for almost all A.


† Loc. cit., p. 459.


1933] SETS IN n-DIMENSIONAL SPACE 635 


To complete the proof of Theorem I we notice that the argument of
Besicovitch† may be used to establish the existence of a part $A_1$ of A with
$L_k(A_1)$ arbitrarily near to $L_k(A)$ such that about each point $a_1$ of $A_1$ there
exists a hypersphere $H(a_1, d)$ for which

oxd* 
Li{AH(au, d)} = 
(a )} 
where 7 is arbitrary and d<7. 
From this we get, by dividing by /;4, 
L,[AH(a, d)] 


= 


Completing the argument along the lines followed by Besicovitch we finally 
arrive at 


1 


2* ve 


D*(a) 2 


at almost all points of A. 

We note that if hyperspheres are used in computing the outer measure of
A then we get $1/2^k$ as a lower bound for the upper density at almost all points
of A.

If, at a point a of the set A, $D_*(a) = D^*(a) = 1$, then A is regular at a.
Otherwise A is irregular at a. The existence of sets of extent k which are
regular at almost every point is obvious. Besicovitch‡ has shown the existence of
linearly measurable plane sets which are irregular. An evident modification
of his methods may be used to construct sets of extent k =2 in S; which are 
irregular. 

THEOREM II. If A is any set of extent k then the part of A for which $D^*(a) = 0$
has zero measure, regardless of the type of region used in estimating $L_k(A)$.

For the sake of simplicity we prove this for a set A of extent two in three- 
dimensional space. With suitable notation the method may be used to obtain 
the same result for sets of extent greater than two. 

Let A; be the part of A for which 

I: [A H(a, r) ] 

hy 
For 6 sufficiently small L2(A;) >0. Put As in wm, ue, - - - where d;<6/2*/?, and 
<L,(As) +e. About u; circumscribe a rectangular parallelepiped p; with 


<er<. 


† Loc. cit., pp. 428-429.
‡ Loc. cit., p. 431.




longest side parallel to a greatest diameter d; of u;. Then, since u; is convex, 
the-maximal plane section of p; is not greater than 4/, . Circumscribe p; by a 
cylinder C; with axis parallel to longest side of p;. Then the measure q; of a 
cross section of this cylinder through the axis is not greater than 4/, . It is 
evidently possible to cover a part of C; with cylinders C;; with length #;; 
and radius r;; where ¢;;=2r;;<2d,, 8l2 , and where 
L,(AsCi;) > L(As)/2. Fix any point of A; in C;;, and with this point as center 
construct a sphere H(a;, 2¢;;) with radius 2¢;;. H(as, 2¢;;) then contains C;; 
and hy**ii = 41 (2t;;)? = 167q;;. From these and (1), since the greatest linear di- 
mension of C;; is less than 2*/*d; <6, we get 


ie 


he? tii 


which gives 


< €128r[Li(As) +e] < . 


But $\lambda$ can be fixed greater than zero, and $\varepsilon$ can be taken arbitrarily small
independent of $\lambda$. We are thus led to a contradiction, which proves the
theorem.

4. Separated sets. A point set A is separated from a point set B, if a part
of A can be put in a set of open convex regions $\alpha$ in such a way that


(1) $L_k(\alpha B) < \varepsilon$ and $L_k(A - \alpha A) < \varepsilon$.


THEOREM III. If A is separated from B, then B is separated from A. 


It follows from CIII that if a=a1, a2, - - satisfies (1), and if a’ =a, 
* * , then for sufficiently large 


Li(a’A) > L,(A) — 
Let Vin Where 2;; is a closed region interior to a;, such that 


contains and lim v;;=a;. Then by CIII, for 7 sufficiently large, we 
have 


(1) L,(AV,) > Li(a’A) — € > — 3e. 


Put the part of B exterior to a in 8 so that any point of 8 is distant from V; 
by not less than 6>0. Then 


— BB) S Li(aB) <«, 




and, by CIV, 
Li(ViA) + = + BA) L(A), 
which, with (1), gives - 
L,(BA) < 3e. 
THEOREM IV. If A and B are separated sets, both of extent k, then

$$L_k(A) + L_k(B) = L_k(A + B).$$

Put A in a set of open convex regions $\alpha$ so that

(1) $L_k(\alpha B) < \varepsilon, \quad L_k(A - \alpha A) < \varepsilon.$

Set E = A + B. Then, by Lemma II,

(2) $L_k(\alpha E) + L_k(E - \alpha E) = L_k(E),$

which, from CI, gives

(3) $L_k(\alpha A) + L_k(B - \alpha B) \le L_k(E).$

But from Lemma II we get

(4) $L_k(\alpha A) + L_k(A - \alpha A) = L_k(A),$

and

(5) $L_k(\alpha B) + L_k(B - \alpha B) = L_k(B).$

It then follows from (1), (3), (4), (5), and the fact that $\varepsilon$ is arbitrary, that

$$L_k(A) + L_k(B) \le L_k(A + B).$$

But by CII

$$L_k(A) + L_k(B) \ge L_k(A + B).$$

These two inequalities give the theorem.
Let $A_B^0$ be the part of A for which

$$\lim_{r \to 0} \frac{L_k[B H(a, r)]}{L_k[A H(a, r)]} = 0,$$

and $A_B^+$ the part of A for which

$$\limsup_{r \to 0} \frac{L_k[B H(a, r)]}{L_k[A H(a, r)]} > 0.$$

Define $B_A^0$ and $B_A^+$ by interchanging the roles of A and B.




The ratios which are used in determining the sets $A_B^0$ and $A_B^+$ are
defined for almost all A. For otherwise there would exist a part of A with outer
measure $> 0$ for each point a of which $D^*(a) = 0$. But this contradicts
Theorem II. Likewise the ratios used in determining the sets $B_A^0$ and $B_A^+$ are
defined for almost all B.

THEOREM V. The set $A_B^0$ is separated from B, and the set $B_A^0$ is separated
from A.

For a given $\varepsilon > 0$ there corresponds to each point of $A_B^0$ a number $\rho > 0$
such that

(1) $\frac{L_k[B H(a, r)]}{L_k[A H(a, r)]} < \varepsilon, \quad r < \rho.$

If E is the part of $A_B^0$ for which (1) holds, then for $\rho$ sufficiently small it
follows from CIII that

(2) $L_k(E) > L_k(A_B^0) - \varepsilon.$

About each point of E there then exists a sequence of hyperspheres $H(e, r_i)$
with $r_i$ tending to zero for which (1) holds. Vitali's argument may now be
used to show the existence of a non-overlapping set of these hyperspheres
$H = H_1, H_2, \cdots$ which contain almost all of E. By (2) and Lemma II we get

$$L_k(A_B^0 - H A_B^0) < \varepsilon,$$

and from (1)

$$\sum L_k(B H_i) < \varepsilon \sum L_k(A H_i), \qquad L_k(BH) < \varepsilon L_k(A),$$

where $\varepsilon$ is arbitrary. Thus $A_B^0$ is separated from B. In a similar manner it
may be shown that $B_A^0$ is separated from A.

THEOREM VI. There is no part of $A_B^+$ with outer measure greater than zero
which is separated from B, and no part of $B_A^+$ with outer measure greater than
zero which is separated from A.

Suppose there is such a part of $A_B^+$. CIII may then be used to show the
existence of a positive number d and a part E of $A_B^+$ such that at each point e
of E we have

(1) $\frac{L_k[B H(e, r_i)]}{L_k[A H(e, r_i)]} > d$

for a properly chosen sequence of values of $r_i$ tending to zero. By supposition,
E is separated from B. It is, therefore, possible to put a part of E in a set of
open convex regions $\alpha$ in such a way that

(2) $L_k(\alpha B) < \varepsilon$, and $L_k(E - \alpha E) < \varepsilon$.

About each point of E on $\alpha$ may be put a sequence of hyperspheres $H(e, r_i)$
satisfying (1) and such that all the hyperspheres are on $\alpha$. Vitali's argument
may now be used to show the existence of a countable non-overlapping set of
these hyperspheres containing almost all the part of E on $\alpha$. For this set $H_i$ of
hyperspheres we get from (1)


>Li(BH,) > d > dL,(E) — «. 
But this contradicts (2). We conclude, therefore, that there is no part of
$A_B^+$ with outer measure $> 0$ which is separated from B. In a similar manner
it can be shown that there is no part of $B_A^+$ with outer measure $> 0$ which is
separated from A.

It has now been shown that $A_B^0$ is separated from B and no part of $A_B^+$ is
separated from B, with similar remarks applying to $B_A^0$ and $B_A^+$. From these
facts it is easy, by methods used above, to obtain

THEOREM VII. $A_B^0$ is separated from $A_B^+$ and $B_A^0$ is separated from $B_A^+$.

We next prove

THEOREM VIII. $L_k(A_B^+) = L_k(B_A^+) = L_k(A_B^+ + B_A^+)$.


Suppose $L_k(A_B^+) = L_k(B_A^+) + c$, c > 0. Making use of Lemma III, a
number $\rho > 0$ may be so fixed that for any set of open convex regions
$V = v_1, v_2, \cdots$ with $d_i < \rho$ we have


(1) LiVA#) < +7 
Now let V with $d_i < \rho$ enclose $B_A^+$ in such a way that


< Li( Bat) + 
This, with (1), gives 


c 
Li(VAst) < Li( Bast) + 


which shows that there is a part E of $A_B^+$ exterior to V with $L_k(E) > c/2$.
But V contains $B_A^+$. Hence, since E is exterior to V, it may be shown by
methods used above that E, a part of $A_B^+$ with outer measure $> 0$, is
separated from B. But this contradicts Theorem VI. We conclude, therefore,
that $L_k(A_B^+) \le L_k(B_A^+)$. Precisely the same argument shows that
$L_k(B_A^+) \le L_k(A_B^+)$. We thus have $L_k(A_B^+) = L_k(B_A^+)$, which is the first part of the
theorem.




Suppose $L_k(A_B^+ + B_A^+) = L_k(A_B^+) + c$, c > 0. Lemma III may be used to
enclose $A_B^+$ in a set of open convex regions $V = v_1, v_2, \cdots$ in such a way that


(2) < Le(Ast) + = 
and 
(3) + BHV] < + 


which shows that there is a part E of $A_B^+ + B_A^+$ exterior to V with measure
$> c/2$. Since V contains $A_B^+$ it follows that E belongs to $B_A^+$. Reasoning as
above, we arrive at the conclusion that E is separated from A. But this
again contradicts Theorem VI. Thus we conclude that $L_k(A_B^+ + B_A^+) \le L_k(A_B^+)$.
Hence, since always $L_k(A_B^+ + B_A^+) \ge L_k(A_B^+) = L_k(B_A^+)$, we have

$$L_k(A_B^+ + B_A^+) = L_k(A_B^+) = L_k(B_A^+).$$
From Theorems IV, V, and VII, we get

THEOREM IX.

$$L_k(A) = L_k(A_B^0) + L_k(A_B^+),$$
$$L_k(B) = L_k(B_A^0) + L_k(B_A^+),$$
$$L_k(A + B) = L_k(A_B^0) + L_k(B_A^0) + L_k(A_B^+ + B_A^+).$$

Theorems VIII and IX may now be combined to give

THEOREM X.

$$L_k(A) + L_k(B) = L_k(A + B) + L_k(A_B^+ + B_A^+).$$
5. Relations between sets in general and measurable sets. Let A be any
set of finite extent k in $S_n$, B the complement of A. Let C be the part of B for
which

(1) $\limsup_{r \to 0} \frac{L_k[A H(b, r)]}{h_r^k} > 0.$

THEOREM XI. The set C is of extent not greater than k.
Suppose there is some integer $j \ge 1$ for which
$L_{k+j}(C) > 0$.


On account of (1) there exist two positive numbers 6 and d, and a part C; 
of C with Ly,;(C:) >0 for which 




Ly|AH(a, 
r)] 
hit 


(2) d 
for a proper choice of r<6. Since Li4;(C1) >0, it follows from Lemma I that 
there exists a part C: of C, with L:(C2) >G, G an arbitrary positive number. 
Choose a sequence 6;>6,> - - - tending to zero, and let C* be the part of C, 
for which (2) holds for some r>6,. Then Ci tends to C2. Thus there exists 
5’>0 anda part C; of C, with L;(C3) >G for which we have 

Li |AH (cs, 
(3) el (cs r)| 

Ait 

for some r>6’. Now put C; in a set of open convex regions 1, 2, - - - with 
d;<6’, and such that 


> Li(Cs) —€ >G. 


In each u; choose a point c; of C; and about this point put a hypersphere 
H(cs, r) with r;>6’ and satisfying (3). Then 4 >1,4 , and consequently from 
(3) we get 

[AH (cs, ri) | > L,[AH(cs, ri) >d. 


This gives 

> Li [AH (cs, > d > dG. 
But since $L_k(A)$ is finite, and since G can be chosen arbitrarily large, this
gives a contradiction. Hence our assertion is proved.


THEOREM XII. If A is any set of finite extent k, then a necessary and
sufficient condition that A be measurable is that $L_k(C) = 0$.


Let W be any set of extent k. We show that if $L_k(C) = 0$, then
(1) $L_k(W) = L_k(AW) + L_k(W - AW).$


Set W—AW =E. Since E—EC belongs neither to A nor to C, we have for 
any point e of E—EC 


r—0 h 


Hence, since L;(C) =0, for almost all E we have 




But for almost all £ 

hit 

lim 2 

r0 L,[EH(e, r)] 
and this, with (2), gives, for almost all E, 

Lx [AH(e, r)] 

lim = 

L,[EH(e, r)] 
Hence almost all E—EC belongs to W,°. And since L;(EC) =L,(C) =0, it 
follows that almost all E belongs to W,°. Hence E=W—AW is separated 
from A and consequently from WA. The truth of (1) now follows from 
Theorem IV. Then, according to our definition of measurability, A is measur- 
able. Thus the condition is sufficient. 


At every point of C 
L,|AH(c, r)] 
lm ———— > 0 


hit 


Hence for almost all C 
— L,|AH(c, hit 
im 
Li [CH(c, r)] L,[CH(c, r)| 


But for almost all C 
hit 


lim ———— 2 1. 

70 L,[CH(c, r)] 
Consequently, for almost all C, 

L:[AH(c, 

im 

Ly [CH(c, r)] 


We conclude, therefore, that almost all C belongs to $C_A^+$. Hence if $L_k(C) > 0$
it follows that $L_k(C_A^+) > 0$. Now let $W = A_C^+ + C_A^+$. Then by Theorem VIII,

$$L_k(W) = L_k(A_C^+) = L_k(C_A^+).$$

But

$$L_k(AW) + L_k(W - AW) = L_k(A_C^+) + L_k(C_A^+) = 2L_k(W).$$

Hence A is not measurable. This shows that the condition is necessary.
THEOREM XIII. Let A be any set of finite extent k. Then the set

A' = A + C

contains A, is measurable with

$$L_k(A') = L_k(A),$$

and is well-defined in terms of A.

A point b of the set B complementary to the set A does or does not belong
to C according as the upper limit of

$$\frac{L_k[A H(b, r)]}{h_r^k}$$

is greater than zero, or is equal to zero. Hence C, and consequently A', is
effectively defined in terms of A.


To show that A’ is measurable, let c’ be a point of C’. Then c’ belongs 
neither to A nor to C, and 
r)] 


r—0 hit 


But this, with Theorems IV, VII, and VIII, gives 
{ie H(c’, r)| Lil(At + CH) > 0, 
10 hiv hy 
— + 
& + Ad)H(c r)] >0 


r—0 hit 


which makes c' a point of C. Hence C' is empty and $L_k(C') = 0$. Then by
Theorem XII A' is measurable.


THEOREM XIV. If the set A of extent k is regular then A’ is regular, and if 
this set is irregular then A’ is irregular. 

Let A be regular, and suppose there is a part of A’, other than null parts, 
at which A’ is irregular. Obviously A’ is regular at each point of A. In the 
proof of Theorem XII it was shown that almost all C belongs to C4+. Hence 
there is a part E of C,*+ for which 


r|A’H(e, 


hit 


(1) 


for an infinite set of arbitrarily small r, 7>0, and L;,(E£) >0. Then, since E 
belongs to C+, almost all points of E are points of E,*+. This and Theorem 
VIII then give 


L,(As) = Li(Es#*) = Li (EZ) > 0. 


The set A’ is regular at points of A and consequently at points of Ax*. Let 




F be the part of Ag* which is such that 
L,|A'H 
Inf 2 
for r<6, Li(F)>0. The set F belongs to Agt, and consequently almost all 
F belongs to Fs*. Hence, as above, 
L,(E*) = Li(Fs*) = Li (F) > 0. 
For each point of E*,(1) holds. Hence about a fixed point x of this set there 
exists H(x, r) with r<6 such that 
L,|A’H 
(x, r)] 
hit 


(2) 


(3) 


Every point of Ey is a limit point of points of Fg*. Let x1, x2, - - - be a se- 
quence of points of Fg tending to x. On account of (2) for an arbitrary e 
there exists H(x;, r—«) about each x; for which 

L,[A’H (xi, r — ©] ” 


+ >1-— 
(4) 


For ¢ fixed and sufficiently large H(x;, r—) is interior to H (x, r). From this 
and (4) we have 


(5) L,[A’H(x, r)] = Li [A’H(«i, r — €)] > — Anh. 
But for ¢ sufficiently small /,’~* is arbitrarily near to 4, which makes (5) and 
(3) contradictory. We conclude, therefore, that if A is regular A’ is regular. 


The proof for the case when A is irregular is along the same lines, and we 
merely sketch it. Except for a null set 


A’ = Ae + At + CH. 


Since A¢* is separated from C,+ it readily follows that A’ is irregular at each 
point of A°. Also, since for any H(a+, r), 


Li [(Ac + C#)H(at, r)] = Li[AtH (at, 


it follows that A’ is irregular at each point of Act. Let E be the part of C at 
which A’ is regular. Let F be the part of Ag for which 


L,[A’H(f, r)] < 
hit 


for an infinite set of arbitrarily small r. Let G be the part of E;*+ at which, 
for all r <6, 


(1) 




L.[A’H(g, 
hit 


Then for Fg*, (1) holds, and for G;*, (2) holds. It is now possible to take a 
point x of Fgt, and a sequence of points x, %2, - - - of Gy tending to x, and 
arrive at a contradiction, as in the case when A was regular. 

6. Some applications. It was shown in the introduction that corresponding
to a set A of linear extent there was a measurable set $\bar{A}$ which contained
A, and for which


(2) 


$$L(\bar{A}) = L(A).$$


It can likewise be shown that there is a measurable set $\bar{A}$ similarly related
to any set A of extent k. We are now in a position to discuss the density
properties of this set $\bar{A}$.

If A is regular (irregular) then $\bar{A}$ is regular (irregular).

To prove this, set $\bar{A} = A + B$ where $L_k(B) > 0$. If $L_k(B) = 0$ the case is
trivial. The set B is not separated from A. For then we would have

$$L_k(\bar{A}) = L_k(A) + L_k(B)$$

which cannot hold, since $L_k(\bar{A}) = L_k(A)$. Hence almost all B belongs to $B_A^+$.
Now suppose that A is regular but that there is a part of $B_A^+$ with outer
measure greater than zero at which $\bar{A}$ is irregular. Let E be the part of $B_A^+$
at which


Ly|AH(e, r 
hit 


for a sequence of values of r tending to zero, and L;(Z) >0. Since E belongs 
to B,t, each point of Eis a point of E,4*. Let F be the part of Ag which is such 
that 


L,/AH(f, r)] > 


(1) 


for r<6, and L;(F) >0. Each point of F is a point of Fj*. About a point x of 


E;* put a hypersphere H(x, r) with r<6, and such that 
Ly|AH (x, 
(2) <1i-7 
k 


Let x; be a sequence of values of Fst tending to x. About each x; put a hyper- 
sphere H(x;, r—). Then, on account of (1), we have 
L,[AH(xi, 


3 
(3) 




For any $\varepsilon$, i can be taken large enough to insure that $H(x_i, r - \varepsilon)$ is interior
to H(x, r). But for $\eta$ fixed, $\varepsilon$ can be taken arbitrarily small, which, with (2),
(3), and the fact that for $\varepsilon$ small $h_{r-\varepsilon}^k$ is near to $h_r^k$, leads to a contradiction.
We conclude, therefore, that if A is regular then $\bar{A}$ is regular at almost all
points. Similar reasoning shows that if A is irregular then $\bar{A}$ is irregular at
almost all points.

If A is any plane set of linear extent then the sets $\bar{A}$ and A' are linearly
measurable, and are regular or irregular according as A is regular or irregular.
Besicovitch has shown that linearly measurable regular plane sets have a
tangent at almost all points. Hence if A is regular there exists a tangent at
almost all points of $\bar{A}$, and of A'. And since each of these sets contains A
it follows that A has a tangent at almost all points. Likewise the other
theorems which Besicovitch has proved for linearly measurable plane sets are
seen to hold for general sets of linear extent.

In proving that a regular linearly measurable plane set A has a tangent 
at almost every point, Besicovitcht makes use of the set A, which is the part 
of A for which 


L,|AH(a, r)] 1 
hy’ 


for $r < \delta$. He assumes that $A_1$ is linearly measurable. The measurability of
this set can hardly be considered as obvious, and there seems to be no trivial
proof for his assertion. We shall establish some general results from which
the measurability of $A_1$ follows.

We show first that 

Separated divisions of measurable sets are measurable. 

Let A be any measurable set of extent k, $A_1$ and $A_2$ separated divisions
of A. Since $A_1$ and $A_2$ are separated, $C_1$ contains at most a null part of $A_2$,
and $C_2$ contains at most a null part of $A_1$. Hence, except for at most a null
set, $C_1$ and $C_2$ belong to C. But since A is measurable, Theorem XII gives
$L_k(C) = 0$. But this makes $L_k(C_1) = L_k(C_2) = 0$, which again by Theorem XII
makes $A_1$ and $A_2$ measurable.

Let A be any set of extent k. Let B be the part of A for which 


L,{AH(a, r)] 
hit 


(1) i-y 


for r<6é. Then the sets B and E=A —B are separated. 


Tt Loc. cit., p. 438, 




Suppose $B_E^+$ exists with $L_k(B_E^+) > 0$. Then, by Theorem VIII,
$L_k(E_B^+) = L_k(B_E^+) > 0$. For each point x of $E_B^+$ there exists some $r < \delta$ for which
L,|AH(x, r)| <1 


(2) 


— 7. 


Take a sequence of points x1, %2,- - - of Bgt tending to x, and about each x; 
put a hypersphere H(x;, r—e). Then from (1) we have 


(3) 


But, for every $\varepsilon$, i can be taken so large that $H(x_i, r - \varepsilon)$ is interior to H(x, r).
Then, since $\varepsilon$ can be taken arbitrarily small independent of $\eta$, (2) and (3) are
contradictory, which allows us to conclude that B and E = A − B are
separated.

It can likewise be shown that the part of A for which 


L,[AH(a, r)] 
hit 


i- Sit+n 

is separated from the remainder of A. From this it follows that the set $A_1$ of
Besicovitch is separated from $A - A_1$. Then, since A is measurable, $A_1$ and
$A - A_1$ are measurable. We note further that if A is not measurable the sets
$A_1$ and $A - A_1$ are, nevertheless, separated. This fact permits the arguments
of Besicovitch in regard to tangency to be carried through for any plane set
of linear extent.


ACADIA UNIVERSITY, 
WOLFVILLE, Nova Scotia 




SUBHARMONIC FUNCTIONS AND 
MINIMAL SURFACES* 


BY 
E. F. BECKENBACH† AND TIBOR RADÓ


INTRODUCTION 
0.1. Let f(w), given by


f(w) = x(u, v) + iy(u, v), w = u + iv, w in D,

where D is some domain of definition, be an analytic function of the complex
variable w. Then x(u, v), y(u, v) satisfy the Cauchy-Riemann differential
equations

(1) $x_u = y_v, \quad x_v = -y_u,$

the subscripts denoting differentiation. These equations (1) are not
symmetric in x, y, but they imply the symmetric set

(2) $x_u^2 + y_u^2 = x_v^2 + y_v^2, \quad x_u x_v + y_u y_v = 0.$

Conversely, (2) implies either (1) or

(3) $y_u = x_v, \quad y_v = -x_u.$

From either (1), (2) or (3) it follows that x(u, v), y(u, v) are harmonic
functions:

$$x_{uu} + x_{vv} = 0, \quad y_{uu} + y_{vv} = 0.$$

If (1) holds, y(u, v) is said to be the conjugate harmonic function of x(u, v),
or if (3) holds, x(u, v) is said to be the conjugate harmonic function of y(u, v);
generally, if (2) holds then x(u, v), y(u, v) will be called a couple of conjugate
harmonic functions.
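
The relation between the Cauchy-Riemann equations, the symmetric set, and the swapped equations can be checked on a concrete case. The sketch below is only an illustration under stated assumptions: it uses f(w) = w^3 and its complex conjugate as examples (neither appears in the paper), with first derivatives worked out by hand, and the helper names are hypothetical. Integer points keep the arithmetic exact.

```python
def cubic(u, v):
    """First derivatives (x_u, x_v, y_u, y_v) for f(w) = w**3,
    where x = u^3 - 3 u v^2 and y = 3 u^2 v - v^3 (hand-computed)."""
    return (3*u*u - 3*v*v, -6*u*v, 6*u*v, 3*u*u - 3*v*v)


def conjugate_cubic(u, v):
    """Same derivatives for the conjugate of w**3, that is, y replaced by -y."""
    x_u, x_v, y_u, y_v = cubic(u, v)
    return (x_u, x_v, -y_u, -y_v)


def cauchy_riemann(d):
    """Equations (1): x_u = y_v, x_v = -y_u."""
    x_u, x_v, y_u, y_v = d
    return x_u == y_v and x_v == -y_u


def swapped_cauchy_riemann(d):
    """Equations (3): y_u = x_v, y_v = -x_u."""
    x_u, x_v, y_u, y_v = d
    return y_u == x_v and y_v == -x_u


def symmetric_set(d):
    """Equations (2): equal lengths and orthogonality of the two gradients."""
    x_u, x_v, y_u, y_v = d
    return x_u**2 + y_u**2 == x_v**2 + y_v**2 and x_u*x_v + y_u*y_v == 0


point = (1, 2)
print(cauchy_riemann(cubic(*point)), symmetric_set(cubic(*point)))
print(swapped_cauchy_riemann(conjugate_cubic(*point)),
      symmetric_set(conjugate_cubic(*point)))
```

In this example the analytic function satisfies (1), its conjugate satisfies (3) but not (1), and both satisfy the symmetric set (2).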
0.2. Generalizing this situation to the case of three functions x(u, v),
y(u, v), z(u, v), (u, v) in D, we shall call x(u, v), y(u, v), z(u, v) a triple of
conjugate harmonic functions provided the following conditions are satisfied:


(i) E = G, F = 0,
where

$$E = x_u^2 + y_u^2 + z_u^2, \quad F = x_u x_v + y_u y_v + z_u z_v, \quad G = x_v^2 + y_v^2 + z_v^2;$$

(ii) x(u, v), y(u, v), z(u, v)

are harmonic.
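
Conditions (i) and (ii) can be tested pointwise on a familiar example. The sketch below assumes the standard isothermic parametrization of Enneper's surface, x = u - u^3/3 + u v^2, y = -v - u^2 v + v^3/3, z = u^2 - v^2, which is not taken from the paper; the derivatives are hand-computed and the function names are hypothetical. For contrast it also tests the map (u, v, u^2 - v^2), whose three coordinates are harmonic although condition (i) fails.

```python
def dot(a, b):
    """Scalar product of two 3-vectors."""
    return sum(p * q for p, q in zip(a, b))


def enneper(u, v):
    """Hand-computed r_u, r_v and coordinate Laplacians for Enneper's surface."""
    r_u = (1 - u*u + v*v, -2*u*v, 2*u)
    r_v = (2*u*v, -1 - u*u + v*v, -2*v)
    laplacian = (-2*u + 2*u, -2*v + 2*v, 2 - 2)  # x_uu + x_vv, y_uu + y_vv, z_uu + z_vv
    return r_u, r_v, laplacian


def saddle(u, v):
    """Same data for (u, v, u^2 - v^2): harmonic coordinates, not isothermic."""
    r_u = (1, 0, 2*u)
    r_v = (0, 1, -2*v)
    laplacian = (0, 0, 2 - 2)
    return r_u, r_v, laplacian


def is_conjugate_triple_at(surface, u, v):
    """Check E = G, F = 0 and vanishing Laplacians at one point."""
    r_u, r_v, laplacian = surface(u, v)
    E, F, G = dot(r_u, r_u), dot(r_u, r_v), dot(r_v, r_v)
    return E == G and F == 0 and all(t == 0 for t in laplacian)


grid = [(u, v) for u in range(-3, 4) for v in range(-3, 4)]
print(all(is_conjugate_triple_at(enneper, u, v) for u, v in grid))
print(is_conjugate_triple_at(saddle, 1, 0))
```

With integer grid points the comparison is exact: the Enneper data pass at every grid point, while the second map already fails E = G at (1, 0).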


* Presented to the Society, December 29, 1932; received by the editors January 23, 1933. 
† National Research Fellow.




It might be noted that if one of the coordinate functions vanishes iden- 
tically, say z=0, then (ii) is implied by (i); but in general this implication 
does not hold. 

0.3. While this generalization no doubt would be of interest from a 
purely analytic viewpoint, the following theorem of Weierstrass shows that it 
actually is very important geometrically: A necessary and sufficient condi- 
tion that a surface given in terms of isothermic parameters (that is, param- 
eters u, v such that E=G, F=0) be minimal is that the coordinate functions 
be harmonic. 

Thus the theory of minimal surfaces appears as the theory of triples of 
conjugate harmonic functions, while the theory of couples of conjugate har- 
monic functions is the theory of analytic functions of a complex variable. As 
a matter of fact, theorems and methods in theory of functions always have 
served as tools and models in the theory of minimal surfaces. 

0.4. The purpose of the present paper is the development of this analogy 
in the direction of the principle of the maximum. If f(w) is an analytic func- 
tion in a region R, then | f(w)| takes on its maximum on the boundary of R. 
Similarly, if x(u, v), y(u, v), z(u, v) form a triple of conjugate harmonic
functions in R, then $(x^2 + y^2 + z^2)^{1/2}$ takes on its maximum on the boundary of R;
this is easily shown to be true even if the three harmonic functions are not
conjugate. However, the effectiveness of the principle of the maximum 
in the case of analytic functions depends essentially upon the fact that 
certain operations (multiplication for instance), if performed on analytic 
functions, yield analytic functions again. This situation does not seem to 
admit of any direct generalization to minimal surfaces. It is our purpose to 
show that despite this lack of direct analogy many important applications 
of the principle of the maximum can be generalized to minimal surfaces. Our 
tool is the following simple lemma (see §2): 


Three functions x(u, v), y(u, v), z(u, v), continuous in a domain, form there
a triple of conjugate harmonic functions if and only if
$\log[(x+a)^2 + (y+b)^2 + (z+c)^2]^{1/2}$ is subharmonic for every choice of the real constants a, b, c.


This lemma permits us to apply the theory of subharmonic functions,
so important in theory of functions, to the theory of minimal surfaces. For
the convenience of the reader, we give in §1 the necessary definitions and facts
concerning subharmonic functions.


* See F. Riesz, Sur les fonctions subharmoniques et leur rapport à la théorie du potentiel (in two
parts), Acta Mathematica, vol. 48 (1926), pp. 329-343, and vol. 54 (1930), pp. 321-360; P. Montel,
Sur les fonctions convexes et les fonctions sousharmoniques, Journal de Mathématiques, (9), vol. 7
(1928), pp. 29-60; S. Saks, Sur une inégalité de la théorie des fonctions, Acta Szeged, vol. 4 (1928),
pp. 51-55, and On subharmonic functions, Acta Szeged, vol. 5 (1932), pp. 187-193.




1. SUBHARMONIC FUNCTIONS AND FUNCTIONS OF CLASS PL 


1.1. In this section we present the definition of subharmonic functions 
and give those results concerning these functions which we shall need in the 
sequel. 

Let g(u, v) be a continuous function of two variables, defined in a domain
D (connected open set). Suppose that for each point $(u_0, v_0)$ of D we have

(4) $g(u_0, v_0) \le \frac{1}{2\pi} \int_0^{2\pi} g(u_0 + \rho \cos\varphi, v_0 + \rho \sin\varphi)\, d\varphi$

for each sufficiently small value of the radius $\rho$. Then the function g(u, v)
is said to be subharmonic in D.*
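
The mean-value inequality of this definition can be tried numerically. The following is a rough sketch under stated assumptions: the circle average is approximated by an equally spaced sum, and the three test functions (with Laplacian 4, 0 and -4 respectively) are illustrative choices that do not come from the paper; all names are hypothetical.

```python
import math


def circle_mean(g, u0, v0, rho, n=720):
    """Approximate (1 / 2 pi) * integral of g over the circle of radius rho
    about (u0, v0) by an average over n equally spaced points."""
    total = 0.0
    for k in range(n):
        phi = 2.0 * math.pi * k / n
        total += g(u0 + rho * math.cos(phi), v0 + rho * math.sin(phi))
    return total / n


def g_sub(u, v):
    return u*u + v*v        # Laplacian 4: subharmonic


def g_harm(u, v):
    return u*u - v*v        # Laplacian 0: harmonic, equality in the mean-value relation


def g_super(u, v):
    return -(u*u + v*v)     # Laplacian -4: the inequality fails


u0, v0, rho = 0.3, -0.2, 0.5
for g in (g_sub, g_harm, g_super):
    print(g.__name__, g(u0, v0), circle_mean(g, u0, v0, rho))
```

For these quadratic examples the circle mean differs from the central value by rho^2, 0 and -rho^2 respectively, so only the first two satisfy the inequality.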

The definition can be extended to the case of discontinuous functions, 
but we shall be concerned in this paper only with continuous subharmonic 
functions. 

1.2. It follows immediately from the definition that a subharmonic func- 
tion g(u, v) cannot attain its maximum value at any (interior) point of D, 
unless g(u, v) is identically constant. 

1.3. If a function g(u, v) has continuous partial derivatives of the second
order, then a necessary and sufficient condition that g(u, v) be subharmonic
is that its Laplacian be $\ge 0$:

$$\Delta g = g_{uu} + g_{vv} \ge 0.$$

1.4. Let g(u, v) be subharmonic in the ring

$$r_1 < [(u - u_0)^2 + (v - v_0)^2]^{1/2} < r_2,$$

and let M(r) denote the maximum of g(u, v) on

$$(u - u_0)^2 + (v - v_0)^2 = r^2, \quad r_1 < r < r_2.$$

Then M(r) is a convex function of log r.§
1.5. Obviously, if g(u, v) and h(u, v) are both subharmonic in D, then 
g(u, v) +h(u, v) also is subharmonic there. 


* This definition is due to F. Riesz. See Acta Mathematica, loc. cit., first part, p. 331. 

† See F. Riesz, Acta Mathematica, loc. cit., first part, p. 331.

‡ See F. Riesz, Acta Mathematica, loc. cit., first part, p. 335.

§ See P. Montel, Journal de Mathématiques, loc. cit., where this fact and similar elementary 
facts concerning subharmonic functions are presented in a systematic way. 




1.6. A function p(u, v), defined in a domain D, will be said to be of class 
PL in D provided the following conditions are satisfied there. 
(i) p(u, v) is continuous. 


(ii) $p(u, v) \ge 0$.


(iii) log p(u, v) is subharmonic in the part of D where p(u, v) >0. 

1.7. If p(u, v) is of class PL, then p(u, v) is subharmonic. Indeed, at
points where p(u, v) = 0 the condition (4) of Riesz obviously is satisfied; and
elsewhere the fact that log p(u, v) is subharmonic implies that p(u, v) is
subharmonic.*

1.8. Obviously (see §1.5), the product of a finite number of functions of 
class PL, or any positive power of a function of this class, is again a function 
of class PL. 

1.9. The class PL is invariant under conformal mapping. (The same re- 
mark applies to the class of subharmonic functions.) That is, if p(u, v) is of 
class PL in D and if D is mapped conformally on a (U, V) domain D, then 
p(u, v) is transformed into a function g(U, V) which is of class PL in D. 

1.10. A necessary and sufficient condition that a non-negative function
p(u, v) be of class PL is that $e^{\alpha u + \beta v} p(u, v)$ be subharmonic for every choice of
the real constants $\alpha$, $\beta$.† It follows from this (see §1.5) that the sum of a finite
number of functions of class PL is again a function of class PL.

1.11. The classical example of a function of class PL is the absolute value 
of an analytic function f(w) of w=u+iv. If f(w) is different from zero in a 
domain, then log | f(w)| is harmonic there. Thus | f(w)| is just barely of class 
PL. As a consequence, a great number of theorems concerned with | f(w) | 
are a fortiori true for functions of class PL. We now shall state some of these 
generalized theorems which will be used in the sequel. The proofs run exactly 
in the same way as for | f(w)| ; for this reason we shall sketch just a few of the 
proofs, and otherwise shall give references to typical proofs concerning 
|f(w)|. 

1.12. Let p(u, v) be bounded and of class PL in $u^2 + v^2 < 1$. Suppose
p(u, v) remains continuous on a certain arc $\sigma$ of $u^2 + v^2 = 1$, and vanishes
there. Then $p(u, v) \equiv 0$.

Proof.‡ Choose the integer n so large that $2\pi/n$ is less than the length of


* See P. Montel, Journal de Mathématiques, loc. cit., p. 39. 

† This criterion is due to Montel, Journal de Mathématiques, loc. cit., p. 40, who proved it under
the assumption that p(u, v) has continuous partial derivatives of the first and second order. For the
case of a merely continuous p(u, v), the theorem has been proved by T. Radó, Remarque sur les
fonctions subharmoniques, Paris Comptes Rendus, vol. 186 (1928), pp. 346-348.

‡ Cf. Pólya und Szegö, Aufgaben und Lehrsätze aus der Analysis, Berlin, J. Springer, 1925, vol.
I, p. 139, problem 279.




the arc $\sigma$. If we rotate the unit circle about its center through an angle of
$2\pi/n$, p(u, v) is transformed into a new function $p_1(u, v)$ of class PL (see
§1.9). Let $p_2(u, v), \cdots, p_{n-1}(u, v)$ be the functions of class PL resulting
from further successive rotations of the unit circle through the angle $2\pi/n$.
Then $\psi(u, v) = p\, p_1 \cdots p_{n-1}$ is again of class PL (see §1.8), and $\psi(u, v) \to 0$
if (u, v) converges to any point of $u^2 + v^2 = 1$. Since $\psi \ge 0$, it follows from this
(see §1.2) that $\psi(u, v) \equiv 0$. In particular, $\psi(0, 0) = p(0, 0)^n = 0$, that is to say,
p(u, v) vanishes at the origin. As any point of $u^2 + v^2 < 1$ can be thrown, by
conformal mapping of the unit circle upon itself, into the origin, it follows
that $p(u, v) \equiv 0$.
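
The rotation product used in this proof can be written down for a simple case. The sketch below is an illustration only: it takes p(w) = |w - 1|, the absolute value of an analytic function and hence of class PL by §1.11, which vanishes at a single boundary point rather than on an arc, so the conclusion of §1.12 does not apply to it. The names p and psi in the code are hypothetical helpers. The sketch shows how the product of the rotated copies is formed, that its value at the origin is p(0, 0)^n, and that for this particular p the product collapses to |w^n - 1|.

```python
import cmath
import math


def p(w):
    """|f(w)| with f(w) = w - 1; an example of a function of class PL."""
    return abs(w - 1)


def psi(w, n):
    """Product of p over the n rotations of the argument by multiples of 2 pi / n."""
    zeta = cmath.exp(2j * math.pi / n)
    product = 1.0
    for k in range(n):
        product *= p(w * zeta**k)
    return product


n = 5
w = 0.3 + 0.4j
print(psi(0, n), p(0) ** n)          # value at the origin equals p(0)^n
print(psi(w, n), abs(w**n - 1))      # closed form for this particular p
```

The product is invariant under rotation through 2 pi / n, which is why a zero of p on an arc longer than 2 pi / n forces the product to tend to zero at every boundary point.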

1.13. Let p(u, v) be bounded and of class PL in $u^2 + v^2 < 1$. Suppose
p(u, v) vanishes in a subdomain k of $u^2 + v^2 < 1$. Then $p(u, v) \equiv 0$.

Proof. Consider any fixed point $(u_0, v_0)$ of k. Then given any point
$(u_1, v_1)$ in $u^2 + v^2 < 1$ but not in k, there exists a circle passing through $(u_0, v_0)$,
tangent to $u^2 + v^2 = 1$ from within, and containing $(u_1, v_1)$ in its interior. The
theorem of §1.12 applies to this circle.

1.14. Let p(u, v) be bounded and of class PL in the angle $0 < \text{arc tg}\,(v/u) < \alpha$.
Let p(u, v) remain continuous on the ray u > 0, v = 0, and let $p(u, 0) \to 0$
as $u \to +0$. Then in every angle $0 < \text{arc tg}\,(v/u) < \alpha - \varepsilon$, where $\varepsilon > 0$, we have
$p(u, v) \to 0$ as $(u, v) \to (0, 0)$ in any manner.*

Of course this theorem is true if the domain of definition is only the sector
$0 < \text{arc tg}\,(v/u) < \alpha$, $0 < u^2 + v^2 < r_1^2$; the proof is the same in either case.

1.15. Let p(u, v) be bounded and of class PL in $u^2 + v^2 < 1$. Let (u', v'),
(u'', v'') be two distinct points on $u^2 + v^2 = 1$. Let $(u_n', v_n')$, $(u_n'', v_n'')$ be two
sequences in $u^2 + v^2 < 1$, converging to (u', v'), (u'', v'') respectively, and let
$C_n$ be a continuous arc, joining $(u_n', v_n')$ and $(u_n'', v_n'')$, and comprised in the
ring $1 - \varepsilon_n < (u^2 + v^2)^{1/2} < 1$, where $\varepsilon_n > 0$, and $\varepsilon_n \to 0$. Denote by $\eta_n$ the
maximum of p(u, v) on $C_n$ and suppose that $\eta_n \to 0$. Then $p(u, v) \equiv 0$.†

1.16. Let p(u, v) be $\le 1$ and of class PL in $r^2 = u^2 + v^2 < 1$. Let p(0, 0) = 0
and suppose that for a certain $\alpha > 0$, $p(u, v)/r^\alpha$ remains bounded in 0 < r < 1.
Then $p(u, v) \le r^\alpha$. If the equality holds for any (u, v), $0 < u^2 + v^2 < 1$, then it
holds identically.‡

Proof. Let M(r) denote the maximum of p(u, v) on $u^2 + v^2 = r^2$. Then


* This generalizes a theorem of Lindelöf. Cf. Pólya und Szegö, loc. cit., p. 138, problem 277.
The proof, given there for the special case when p(u, v) is the absolute value of an analytic function,
applies without the change of a word to the general case considered above.

† Cf. L. Bieberbach, Lehrbuch der Funktionentheorie, Berlin, B. G. Teubner, 1927, vol. II, pp.
19-21.

‡ This generalizes the Lemma of Schwarz. See C. Carathéodory, Conformal Representation,
London, Cambridge University Press, 1932, p. 39. The example $p(u, v) = (u^2 + v^2)^{\alpha/2}$ shows that the
value $\alpha = 1$ which holds for the Lemma of Schwarz does not hold in the general case.




$M(r)/r^\alpha$ is the maximum of $p(u, v)/r^\alpha$ on $u^2+v^2=r^2$. Since $\log r^\alpha$ is harmonic,
$p(u, v)/r^\alpha$ is of class PL in $0<u^2+v^2<1$. Therefore (see §1.4), $M(r)/r^\alpha$ is a
convex function of $\log r$, $-\infty<\log r<0$. If such a function is bounded from
above, then it is a non-decreasing function. Consequently, from

$\lim_{r\to 1} M(r)/r^\alpha \le 1$

it follows that $p(u, v)/r^\alpha\le 1$, $0<r<1$. If the equality holds for any $(u, v)$,
$0<u^2+v^2<1$, then (see §1.2) it holds identically.
2. A CHARACTERIZATION OF MINIMAL SURFACES 
2.1. If $x(u, v)$, $y(u, v)$, $z(u, v)$ form a triple of conjugate harmonic functions
(see §0.2) in a domain $D$, then we shall say that the equations

(5) $x = x(u, v),\quad y = y(u, v),\quad z = z(u, v),\quad (u, v)$ in $D$,

give a minimal surface in typical representation. In this statement, the term
minimal surface is used in a more general sense than is customary in differential
geometry, where the condition $EG-F^2>0$ is always required. In
§4.1, we shall use the term minimal surface in an (apparently) even more
general sense.

If the equations (5) give a minimal surface $\mathfrak{M}$ in typical representation,
then the function $(x^2+y^2+z^2)^{1/2}$ will be called the norm of $\mathfrak{M}$ and will be
denoted by $|\mathfrak{M}|$ or $|\mathfrak{M}(u, v)|$ or $|\mathfrak{M}(w)|$, where $w=u+iv$.

2.2. Let

(6) $\mathfrak{M}:\quad x = x(u, v),\quad y = y(u, v),\quad z = z(u, v),\quad (u, v)$ in $D$,

be a minimal surface given in typical representation. Then

(7) $|\mathfrak{M}| = (x^2+y^2+z^2)^{1/2}$

is of class PL.
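It is sufficient to consider points where $|\mathfrak{M}|\ne 0$ (see §1.6). At such points
the Laplacian of $\log|\mathfrak{M}|$ is given by $\Delta\log|\mathfrak{M}| = T/|\mathfrak{M}|^4$, with

$T = (\mathfrak{r}_u^2+\mathfrak{r}_v^2)\,\mathfrak{r}^2 - 2[(\mathfrak{r}\mathfrak{r}_u)^2+(\mathfrak{r}\mathfrak{r}_v)^2],$

where $\mathfrak{r}$, $\mathfrak{r}_u$, $\mathfrak{r}_v$ denote vectors, namely

$\mathfrak{r} = (x, y, z),\quad \mathfrak{r}_u = (x_u, y_u, z_u),\quad \mathfrak{r}_v = (x_v, y_v, z_v);$

and where the vector products indicated are scalar. The parameters being
isothermic, we have

$\mathfrak{r}_u^2 = \mathfrak{r}_v^2 = \lambda,\quad \mathfrak{r}_u\mathfrak{r}_v = 0.$

Since the partial derivatives of the second order of $\log|\mathfrak{M}|$ are continuous
where $|\mathfrak{M}|\ne 0$, we have only to show (see §1.3) that

(8) $T\ge 0.$

Fix $(u_0, v_0)$; then two cases are possible; either $\lambda=0$ or $\lambda>0$. If $\lambda=0$, then
$\mathfrak{r}_u=\mathfrak{r}_v=0$ and (8) is trivial. If $\lambda>0$, then the vectors $\mathfrak{r}_u$, $\mathfrak{r}_v$ are both $\ne 0$ and
are perpendicular to each other; let $\mathfrak{x}$ denote the unit vector perpendicular
to each of them. Then we can write

$\mathfrak{r} = a\mathfrak{r}_u + b\mathfrak{r}_v + c\mathfrak{x},$

where $a$, $b$, $c$ are scalars. Therefore

$\mathfrak{r}^2 = (a^2+b^2)\lambda + c^2,\quad \mathfrak{r}\mathfrak{r}_u = a\lambda,\quad \mathfrak{r}\mathfrak{r}_v = b\lambda,$

$T = 2\lambda[(a^2+b^2)\lambda + c^2] - 2\lambda^2(a^2+b^2) = 2\lambda c^2\ge 0.$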
2.3. The fact that (7) is of class PL certainly does not characterize minimal
surfaces.* However, (6) is still a minimal surface given in typical representation
if we shift the $xyz$-axes. Therefore, for the functions $x(u, v)$,
$y(u, v)$, $z(u, v)$ in (6),

$[(x+a)^2+(y+b)^2+(z+c)^2]^{1/2}$

is of class PL for arbitrary choice of the real constants $a$, $b$, $c$. And, as we now
shall show, the converse also is true, so that we have the following

Lemma. A necessary and sufficient condition that the continuous functions
$x(u, v)$, $y(u, v)$, $z(u, v)$ represent a minimal surface given in typical representation
is that $[(x+a)^2+(y+b)^2+(z+c)^2]^{1/2}$ be of class PL for arbitrary choice of
the real constants $a$, $b$, $c$.


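2.4. The necessity has been proved above. To prove the sufficiency, observe
first that if $(x^2+y^2+z^2)^{1/2}$ is of class PL, then $x^2+y^2+z^2$ also is of class
PL (see §1.8). Let then $(u_0, v_0)$ be any fixed point of $D$, and put $x(u_0, v_0)=x_0$,
$y(u_0, v_0)=y_0$, $z(u_0, v_0)=z_0$. Then if $C$ denotes a sufficiently small circle with
center at $(u_0, v_0)$ and radius $\rho$ we have

$(x_0+a)^2+(y_0+b)^2+(z_0+c)^2 \le \dfrac{1}{2\pi}\displaystyle\int_0^{2\pi}[(x+a)^2+(y+b)^2+(z+c)^2]\,d\varphi,$

the functions under the integral sign being taken at the point $(u_0+\rho\cos\varphi,\ v_0+\rho\sin\varphi)$ of $C$,
whence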


* See §2.6. 


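(9) $x_0^2+y_0^2+z_0^2 - \dfrac{1}{2\pi}\displaystyle\int_0^{2\pi}(x^2+y^2+z^2)\,d\varphi \le 2a\left[\dfrac{1}{2\pi}\int_0^{2\pi}x\,d\varphi - x_0\right] + 2b\left[\dfrac{1}{2\pi}\int_0^{2\pi}y\,d\varphi - y_0\right] + 2c\left[\dfrac{1}{2\pi}\int_0^{2\pi}z\,d\varphi - z_0\right],$

where $x$, $y$, $z$ under the integral signs are taken at the points of the circle $C$.

The point $(u_0, v_0)$ and the circle $C$ being fixed, the right-hand member of this
inequality is a linear function of the arbitrary real constants $a$, $b$, $c$. Thus (9)
clearly implies that the coefficients of $a$, $b$, $c$ vanish. That is to say, $x(u, v)$ for
instance has the property that, for every point $(u_0, v_0)$ in $D$,

$x(u_0, v_0) = \dfrac{1}{2\pi}\displaystyle\int_0^{2\pi} x(u_0+\rho\cos\varphi,\ v_0+\rho\sin\varphi)\,d\varphi,$

for sufficiently small values of $\rho$. As is well known, this property characterizes
harmonic functions.* Thus it follows that $x(u, v)$, $y(u, v)$, $z(u, v)$ are harmonic
functions.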

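2.5. We proceed to show that $E=G$, $F=0$. Let $\mathfrak{r}=(x, y, z)$, and let $\mathfrak{v}=\mathfrak{r}+\mathfrak{a}$,
where $\mathfrak{a}$ is an arbitrary constant vector. By assumption, then, $(\mathfrak{v}^2)^{1/2}$
is of class PL so that (see §§1.3 and 2.2)

(10) $(\mathfrak{v}_u^2+\mathfrak{v}_v^2)\,\mathfrak{v}^2 - 2[(\mathfrak{v}\mathfrak{v}_u)^2+(\mathfrak{v}\mathfrak{v}_v)^2]\ge 0$

at points where $\mathfrak{v}\ne 0$. At points where $\mathfrak{v}=0$, (10) clearly also holds (with
the sign of equality).

Consider a definite point $(u_0, v_0)$ in $D$. Then $\mathfrak{v}_u=\mathfrak{r}_u$, $\mathfrak{v}_v=\mathfrak{r}_v$ regardless
of the choice of the constant vector $\mathfrak{a}$. Choose first $\mathfrak{a}=\mathfrak{r}_u(u_0, v_0)-\mathfrak{r}(u_0, v_0)$.
Then $\mathfrak{v}=\mathfrak{r}_u(u_0, v_0)$ at $(u_0, v_0)$, and (10) gives that

$EG - E^2 - 2F^2\ge 0$

at the point $(u_0, v_0)$. Choose secondly $\mathfrak{a}=\mathfrak{r}_v(u_0, v_0)-\mathfrak{r}(u_0, v_0)$. Then (10) gives

$EG - G^2 - 2F^2\ge 0$

at $(u_0, v_0)$. Addition gives

$-(E-G)^2 - 4F^2\ge 0$

and consequently $E=G$, $F=0$ at $(u_0, v_0)$. Since $(u_0, v_0)$ was any point in $D$,
the lemma of §2.3 is proved.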
2.6. The following remark might help explain the situation. 


* See for instance O. D. Kellogg, Foundations of Potential Theory, Berlin, J. Springer, 1929, p. 
227. 




If $\mathfrak{r}=\mathfrak{r}(u, v)$, $(u, v)$ in $D$, is a minimal surface in typical representation,
then $E=\mathfrak{r}_u^2$ is of class PL in $D$.*

It is clearly sufficient to consider the case when $D$ is the interior of a
circle. Then the components $x(u, v)$, $y(u, v)$, $z(u, v)$, which are harmonic, can
be written in the form

$x = \Re f_1(w),\quad y = \Re f_2(w),\quad z = \Re f_3(w),$

where $f_1(w)$, $f_2(w)$, $f_3(w)$ are single-valued analytic functions of $w=u+iv$ in
$D$. We have then

$x_u - ix_v = f_1',\quad y_u - iy_v = f_2',\quad z_u - iz_v = f_3',$

and hence, on account of $E=G$,

$E = \tfrac{1}{2}\left(|f_1'|^2+|f_2'|^2+|f_3'|^2\right).$

Thus $E$ is the sum of three functions of class PL, and consequently (see §1.10)
$E$ is also of class PL.

As an example, let us consider the surface of Enneper† (in typical representation)

$x = 3u + 3uv^2 - u^3,$

$y = -3v - 3u^2v + v^3,$

$z = 3u^2 - 3v^2.$

Then $x_u$, $y_u$, $z_u$ are three harmonic functions, such that the sum of their squares
is of class PL. Computation shows that $x_u$, $y_u$, $z_u$ are not conjugate. Thus, in
the lemma of §2.3, the parameters $a$, $b$, $c$ are actually necessary, even if the given
three functions are known to be harmonic.
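
The computation referred to in the Enneper example can be checked mechanically. The following is only an illustrative sketch, not part of the original argument: the function names and the sample grid are hypothetical choices, and the partial derivatives are entered by hand from the three coordinate functions. It confirms in exact rational arithmetic that $E=G$, $F=0$ for the surface itself, while the triple $(x_u, y_u, z_u)$ fails the condition $E=G$.

```python
from fractions import Fraction


def enneper_first_derivatives(u, v):
    """r_u and r_v for x = 3u + 3uv^2 - u^3, y = -3v - 3u^2 v + v^3, z = 3u^2 - 3v^2."""
    r_u = (3 + 3 * v * v - 3 * u * u, -6 * u * v, 6 * u)
    r_v = (6 * u * v, -3 - 3 * u * u + 3 * v * v, -6 * v)
    return r_u, r_v


def dot(a, b):
    """Scalar product of two 3-vectors."""
    return sum(p * q for p, q in zip(a, b))


def fundamental_quantities(u, v):
    """E, F, G of the Enneper surface at (u, v)."""
    r_u, r_v = enneper_first_derivatives(u, v)
    return dot(r_u, r_u), dot(r_u, r_v), dot(r_v, r_v)


def derived_triple_quantities(u, v):
    """E, F, G of the triple (x_u, y_u, z_u), regarded as a surface in its own right."""
    s_u = (-6 * u, -6 * v, 6)  # derivative of (x_u, y_u, z_u) with respect to u
    s_v = (6 * v, -6 * u, 0)   # derivative of (x_u, y_u, z_u) with respect to v
    return dot(s_u, s_u), dot(s_u, s_v), dot(s_v, s_v)


# Hypothetical grid of rational sample points.
SAMPLE_POINTS = [(Fraction(a, 2), Fraction(b, 3))
                 for a in range(-3, 4) for b in range(-3, 4)]


def check_enneper():
    """Return True if the isothermic relations hold at every sample point."""
    for u, v in SAMPLE_POINTS:
        E, F, G = fundamental_quantities(u, v)
        assert E == G == 9 * (1 + u * u + v * v) ** 2 and F == 0
        E1, F1, G1 = derived_triple_quantities(u, v)
        assert F1 == 0 and E1 - G1 == 36  # E1 != G1: the triple is not conjugate
    return True
```

On the sample grid the difference $E-G$ for the derived triple is the constant 36, which is one way of seeing that $x_u$, $y_u$, $z_u$ cannot form a conjugate triple.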


3. APPLICATIONS 
3.1. Let

$\mathfrak{M}:\quad x = x(u, v),\quad y = y(u, v),\quad z = z(u, v),\quad u+iv = w,\ |w|<1,$

be a minimal surface given in typical representation, such that $(0, 0)$ is
carried into $(0, 0, 0)$. If $\mathfrak{M}$ is comprised in the unit sphere, $x^2+y^2+z^2<1$,
then

(11) $|\mathfrak{M}(w)|\le|w|,\quad 0<|w|<1,$

and

(12) $E_0^{1/2}\le 1,$

* In a subsequent paper, Subharmonic functions and surfaces of negative curvature, in the present
number of these Transactions, we point out that if a surface is given in typical representation, then
$E=\mathfrak{r}_u^2$ is of class PL if and only if the Gauss curvature of the surface is $\le 0$.

† See G. Darboux, Théorie Générale des Surfaces, Paris, 1887, vol. I, pp. 372-376.




where $E_0^{1/2}$ denotes the length deformation ratio at the origin. The equalities
hold if and only if $\mathfrak{M}$ is a simply-covered circular disc with unit radius.*

Proof. Since $E=G$, $F=0$, we have

(13) $\lim_{w\to 0}|\mathfrak{M}(w)|/|w| = E_0^{1/2}$

and therefore $|\mathfrak{M}(w)|/|w|$ remains bounded in $0<|w|<1$. Consequently in
$0<|w|<1$ we can apply §1.16, with $\alpha=1$, to the function $p(u, v)=|\mathfrak{M}(w)|$.
This gives (11), and then (13) yields (12).

If we define $|\mathfrak{M}(w)|/|w| = E_0^{1/2}$ for $w=0$, then both (11) and (12) are
contained in $|\mathfrak{M}(w)|/|w|\le 1$, $|w|<1$. If then $|\mathfrak{M}(w)|/|w|=1$ for any $w$ in
$|w|<1$, then (see §1.2) the equality is an identity, $|\mathfrak{M}|^2=u^2+v^2$. Differentiation
gives

$\mathfrak{r}\mathfrak{r}_u = u,\quad \mathfrak{r}\mathfrak{r}_v = v,$

$\mathfrak{r}\mathfrak{r}_{uu} + \mathfrak{r}_u^2 = 1,\quad \mathfrak{r}\mathfrak{r}_{vv} + \mathfrak{r}_v^2 = 1,$

whence addition gives $E=G=1$ throughout. Therefore the area of the minimal
surface is

$A = \displaystyle\iint (EG-F^2)^{1/2}\,du\,dv = \pi.$

It follows from this situation that $\mathfrak{M}$ is a simply-covered circular disc.
3.2. Let

$\mathfrak{M}:\quad x = x(u, v),\quad y = y(u, v),\quad z = z(u, v),\quad u^2+v^2<1,$

be a minimal surface given in typical representation, and let $|\mathfrak{M}|$ be bounded.
Suppose $x(u, v)$, $y(u, v)$, $z(u, v)$ remain continuous on a certain arc $\sigma$ of
$u^2+v^2=1$, and $x(u, v)=\text{const.}=x_0$, $y(u, v)=\text{const.}=y_0$, $z(u, v)=\text{const.}=z_0$
there. Then $x(u, v)\equiv x_0$, $y(u, v)\equiv y_0$, $z(u, v)\equiv z_0$.†

Proof. Apply §1.12 to the function

$p(u, v) = [(x(u, v)-x_0)^2+(y(u, v)-y_0)^2+(z(u, v)-z_0)^2]^{1/2}.$

3.3. Let

$\mathfrak{M}:\quad x = x(u, v),\quad y = y(u, v),\quad z = z(u, v),\quad 0<\operatorname{arc\,tg}(v/u)<\alpha,$


* This generalizes the Lemma of Schwarz. Cf. C. Carathéodory, Conformal Representation, p. 39.

† See E. F. Beckenbach, The area and boundary of minimal surfaces, Annals of Mathematics,
(2), vol. 33 (1932), pp. 658-664.

‡ See T. Radó, Some remarks on the problem of Plateau, Proceedings of the National Academy
of Sciences, vol. 16 (1930), pp. 242-248; J. Douglas, Solution of the problem of Plateau, these
Transactions, vol. 33 (1931), pp. 262-321.




be a minimal surface given in typical representation, and let $|\mathfrak{M}|$ be bounded.
Let further $x(u, v)$, $y(u, v)$, $z(u, v)$ remain continuous on the ray $u>0$,
$v=0$, and let $x(u, 0)\to x_0$, $y(u, 0)\to y_0$, $z(u, 0)\to z_0$ as $u\to +0$. Then in every
angle

$0<\operatorname{arc\,tg}\dfrac{v}{u}<\alpha-\sigma$, where $\sigma>0$,

we have $x(u, v)\to x_0$, $y(u, v)\to y_0$, $z(u, v)\to z_0$ as $(u, v)\to(0, 0)$ in any manner.*

Proof. Apply §1.14 to the function

$p(u, v) = [(x(u, v)-x_0)^2+(y(u, v)-y_0)^2+(z(u, v)-z_0)^2]^{1/2}.$

As in §1.14, the theorem is true if the domain of definition is only the sector
$0<\operatorname{arc\,tg}(v/u)<\alpha$, $0<u^2+v^2<r_1^2$.

3.4. Besides the assumptions of §3.3, suppose $x(u, v)$, $y(u, v)$, $z(u, v)$
remain continuous on the ray $\operatorname{arc\,tg}(v/u)=\alpha$, $u^2+v^2>0$, and let $x(u, v)\to x_1$,
$y(u, v)\to y_1$, $z(u, v)\to z_1$ as $(u, v)\to(0, 0)$ along the ray $\operatorname{arc\,tg}(v/u)=\alpha$. Then
$x_0=x_1$, $y_0=y_1$, $z_0=z_1$, and $x(u, v)\to x_0=x_1$, $y(u, v)\to y_0=y_1$, $z(u, v)\to z_0=z_1$ as
$(u, v)\to(0, 0)$ in any manner in the angle $0<\operatorname{arc\,tg}(v/u)<\alpha$.

Proof. Apply §3.3 to the angles

$0<\operatorname{arc\,tg}\dfrac{v}{u}<\dfrac{3\alpha}{4}$ and $\dfrac{\alpha}{4}<\operatorname{arc\,tg}\dfrac{v}{u}<\alpha$

and compare results. As before, the theorem is still true if the domain of definition
is only the sector

$0<\operatorname{arc\,tg}\dfrac{v}{u}<\alpha,\quad 0<u^2+v^2<r_1^2.$

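3.5. The preceding result yields a new proof of the following lemma,
used by J. Douglas in his work on the problem of Plateau.†

Let the integrable functions $\xi(\varphi)$, $\eta(\varphi)$, $\zeta(\varphi)$, substituted in the Poisson
integral formula, determine the (harmonic) coordinate functions of a minimal
surface

$\mathfrak{M}:\quad x = x(u, v),\quad y = y(u, v),\quad z = z(u, v),\quad u^2+v^2<1,$

in typical representation. Let further $\xi(\varphi)$, $\eta(\varphi)$, $\zeta(\varphi)$ approach definite
limit values $\xi_-(\pi)$, $\eta_-(\pi)$, $\zeta_-(\pi)$ and $\xi_+(\pi)$, $\eta_+(\pi)$, $\zeta_+(\pi)$, according as $\varphi\to\pi$ in
clockwise and counterclockwise senses respectively. Then

(14) $\xi_-(\pi) = \xi_+(\pi),\quad \eta_-(\pi) = \eta_+(\pi),\quad \zeta_-(\pi) = \zeta_+(\pi).$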


* This generalizes a theorem of Lindelöf. Cf. Pólya und Szegö, loc. cit.
† J. Douglas, loc. cit., pp. 304-306.




Proof. It is a well known property of the Poisson integral that, because
of the specified nature of the discontinuity of $\xi(\varphi)$ at $\varphi=\pi$, the function
$x(u, v)$ approaches a definite limit if $(u, v)\to(-1, 0)$ along any straight line
in $u^2+v^2<1$, this limit being a linear function of the angle from the $u$-axis
to the straight line and varying from $\xi_-(\pi)$ to $\xi_+(\pi)$ as the angle varies from
$-\pi/2$ to $\pi/2$. Similar statements hold for $y(u, v)$, $z(u, v)$. But if we join two
such straight lines by a circular arc lying in $u^2+v^2<1$, we obtain a sector for
which §3.4 applies; consequently, $(x, y, z)\to$ a definite $(x_0, y_0, z_0)$ which does
not vary with the angle. That is, the linear functions mentioned above are
constants, whence (14).

3.6. Let

$\mathfrak{M}:\quad x = x(u, v),\quad y = y(u, v),\quad z = z(u, v),\quad (u, v)$ interior to $R$,

where $R$ is a Jordan region,* be a minimal surface given in typical representation,
and let $|\mathfrak{M}|$ be bounded. Let further $x(u, v)$, $y(u, v)$, $z(u, v)$ remain continuous
on the boundary of $R$ except possibly at a single point $(u_0, v_0)$, and
let $(x, y, z)\to(x_0, y_0, z_0)$ and $(x, y, z)\to(x_1, y_1, z_1)$ as $(u, v)$ converges on the
boundary to $(u_0, v_0)$ from one side and the other respectively. Then
$(x_0, y_0, z_0)=(x_1, y_1, z_1)$ and $x(u, v)\to x_0$, $y(u, v)\to y_0$, $z(u, v)\to z_0$ as
$(u, v)\to(u_0, v_0)$ in any manner in $R$.

The proof follows immediately from §3.4 by conformal mapping. It can
be obtained also by following step by step the proof, for the absolute value of
an analytic function of a complex variable, based on the rotation-method.†


4. ON CONFORMAL MAPS OF MINIMAL SURFACES 


4.1. The most general definition (actually used in the literature) of a
minimal surface is as follows.‡

A set of equations

(15) $x = \xi(\alpha, \beta),\quad y = \eta(\alpha, \beta),\quad z = \zeta(\alpha, \beta),\quad (\alpha, \beta)$ in $R$,

where $R$ denotes a Jordan region,* defines a continuous surface $S$ of the topological
type of the circular disc, if $\xi(\alpha, \beta)$, $\eta(\alpha, \beta)$, $\zeta(\alpha, \beta)$ are continuous in $R$.

The surface (15) is a minimal surface if the following condition is satisfied.
Given any point $(\alpha_0, \beta_0)$ interior to $R$, there exists a vicinity $V_0$ of $(\alpha_0, \beta_0)$ and
a topological transformation $\bar\alpha=\bar\alpha(\alpha, \beta)$, $\bar\beta=\bar\beta(\alpha, \beta)$ of $V_0$, such that $\xi(\alpha, \beta)$,
$\eta(\alpha, \beta)$, $\zeta(\alpha, \beta)$ are transformed into functions $\bar\xi(\bar\alpha, \bar\beta)$, $\bar\eta(\bar\alpha, \bar\beta)$, $\bar\zeta(\bar\alpha, \bar\beta)$ which
form a triple of conjugate harmonic functions in the image $\bar V_0$ of $V_0$ (see
§0.2). Such parameters $\bar\alpha$, $\bar\beta$ are called local typical parameters.

* That is, the set of points in and on a Jordan curve.

† Cf. C. Carathéodory, Conformal Representation, pp. 21-24.

‡ See T. Radó, Contributions to the theory of minimal surfaces, Acta Szeged, vol. 9 (1932), p. 9.




4.2. According to the fundamental theorem in the theory of uniformization,†
a minimal surface in the general sense defined above admits also of
typical parameters in the large, in the following sense. If

(16) $S:\quad x = \xi(\alpha, \beta),\quad y = \eta(\alpha, \beta),\quad z = \zeta(\alpha, \beta),\quad (\alpha, \beta)$ in $R$,

is a minimal surface, in the sense of §4.1, then there exists a topological
transformation

(17) $u = u(\alpha, \beta),\quad v = v(\alpha, \beta),\quad (\alpha, \beta)$ interior to $R$,

$\alpha = \alpha(u, v),\quad \beta = \beta(u, v),\quad u^2+v^2<1,$

of the interior of $R$ into $u^2+v^2<1$, such that $\xi(\alpha, \beta)$, $\eta(\alpha, \beta)$, $\zeta(\alpha, \beta)$
are carried into three functions

(18) $x(u, v) = \xi(\alpha(u, v), \beta(u, v)),\quad y(u, v) = \eta(\alpha(u, v), \beta(u, v)),\quad z(u, v) = \zeta(\alpha(u, v), \beta(u, v))$

which form a triple of conjugate harmonic functions in $u^2+v^2<1$. Our purpose
in this section is to study the situation on the boundary.

† See C. Carathéodory, Conformal Representation, chapter VII, and also the bibliographical
notes given there on p. 105.

4.3. Using the same notations as in the preceding paragraph, §4.2, suppose
that the functions $\xi(\alpha, \beta)$, $\eta(\alpha, \beta)$, $\zeta(\alpha, \beta)$ in (16) do not all three reduce
to constants on any arc of the boundary of $R$.

Then the transformation (17) remains continuous and one-to-one on the
boundaries. As a consequence, the functions $x(u, v)$, $y(u, v)$, $z(u, v)$ in (18) remain
continuous on $u^2+v^2=1$.

4.4. The preceding assertion will be established if we disprove the following
two possibilities.

(i) Suppose there exist in the interior of $R$ two sequences $(\alpha_n', \beta_n')$,
$(\alpha_n'', \beta_n'')$ converging to the same point $(\alpha_0, \beta_0)$ on the boundary of $R$, such
that the corresponding sequences $(u_n', v_n')$, $(u_n'', v_n'')$ converge to two distinct
points $(u_0', v_0')$, $(u_0'', v_0'')$ on $u^2+v^2=1$. Denote then by $l_n$ an arc in
the interior of $R$, connecting $(\alpha_n', \beta_n')$ and $(\alpha_n'', \beta_n'')$, such that $l_n$ converges
to $(\alpha_0, \beta_0)$; and denote by $C_n$ the image of $l_n$ in $u^2+v^2<1$. Then the theorem
of §1.15 applies to the function

$p(u, v) = [(x(u, v)-\xi(\alpha_0, \beta_0))^2+(y(u, v)-\eta(\alpha_0, \beta_0))^2+(z(u, v)-\zeta(\alpha_0, \beta_0))^2]^{1/2},$

and it follows that $p(u, v)$ vanishes identically. Hence $x(u, v)$, $y(u, v)$,
$z(u, v)$ and consequently $\xi(\alpha, \beta)$, $\eta(\alpha, \beta)$, $\zeta(\alpha, \beta)$ all reduce to constants. This
contradicts the assumption stated in §4.3.

(ii) Denote by $(\alpha_1, \beta_1)$, $(\alpha_2, \beta_2)$ any two distinct points on the boundary of
$R$, and by $C$ a Jordan arc in the interior of $R$ connecting $(\alpha_1, \beta_1)$ and $(\alpha_2, \beta_2)$.
On account of the preceding result, the image $C^*$ of $C$ is a Jordan arc in
$u^2+v^2<1$ with definite end points on $u^2+v^2=1$. We have to disprove the
possibility that these end points coincide. Suppose they do coincide. Then
$C^*$ is actually a closed Jordan curve, which has a unique point $(u_0, v_0)$ in
common with $u^2+v^2=1$. Denote by $D^*$ the interior of $C^*$. Then $x(u, v)$,
$y(u, v)$, $z(u, v)$ satisfy in $D^*$ the assumptions of §3.6. Hence $x(u, v)$, $y(u, v)$,
$z(u, v)$ converge to definite limits $x_0$, $y_0$, $z_0$ if $(u, v)$ converges to $(u_0, v_0)$ from
within $D^*$.

$D^*$ is the image of a domain $D$ in $R$ which is bounded by $C$ and by a certain
arc $\sigma$ of the boundary of $R$. If $(\alpha, \beta)$ converges, from within $D$, to any
point of $\sigma$ then $(u, v)$ converges to $(u_0, v_0)$ from within $D^*$. Hence

$\xi(\alpha, \beta) = x(u, v)\to x_0,\quad \eta(\alpha, \beta) = y(u, v)\to y_0,\quad \zeta(\alpha, \beta) = z(u, v)\to z_0.$

That is to say, $\xi(\alpha, \beta)$, $\eta(\alpha, \beta)$, $\zeta(\alpha, \beta)$ all three reduce to constants on $\sigma$, in
contradiction with the assumption made in §4.3.

4.5. We mention the following two special cases of the theorem of §4.3.
Suppose that $\xi(\alpha, \beta)=\alpha$, $\eta(\alpha, \beta)=\beta$, $\zeta(\alpha, \beta)=0$ in the Jordan region $R$. Then
the assumptions of §§4.2 and 4.3 obviously are satisfied and the theorem of
§4.3 reduces to the so-called Osgood-Carathéodory theorem: If the interior of
a Jordan region $R$ is mapped in a one-to-one and conformal way upon $u^2+v^2<1$,
the map remains continuous and one-to-one on the boundary of $R$.†

4.6. Suppose next that the equations (16) carry the boundary of $R$ in a
topological way into a Jordan curve $\Gamma$. In this case we say that the surface
$S$ is bounded by $\Gamma$. The theorem of §4.3 implies then the following result.
A minimal surface $S$ (in the general sense of §4.1), bounded by a Jordan curve
$\Gamma$, admits of a representation

(19) $x = x(u, v),\quad y = y(u, v),\quad z = z(u, v),\quad u^2+v^2\le 1,$

with the following properties:

(i) $x(u, v)$, $y(u, v)$, $z(u, v)$ form a triple of conjugate harmonic functions in
$u^2+v^2<1$;

(ii) $x(u, v)$, $y(u, v)$, $z(u, v)$ are continuous in $u^2+v^2\le 1$, and the equations
(19) carry $u^2+v^2=1$ in a topological way into the Jordan curve $\Gamma$.

By way of explanation, let us recall that a Jordan curve might bound 
several minimal surfaces, as follows from classical examples. The preceding 
result expresses a property common to all these minimal surfaces. 


† See C. Carathéodory, Conformal Representation, chapter VI.


OHIO STATE UNIVERSITY,
COLUMBUS, OHIO


SUBHARMONIC FUNCTIONS AND SURFACES 
OF NEGATIVE CURVATURE* 


BY 
E. F. BECKENBACH† AND T. RADÓ


INTRODUCTION 
0.1. Given a piece of surface in general parametric representation

$x = x(u, v),\quad y = y(u, v),\quad z = z(u, v),$

the Gauss curvature $K$ of the surface is given by the familiar formula‡

$K = \dfrac{1}{W^4}\left\{\begin{vmatrix} -\tfrac12 E_{vv}+F_{uv}-\tfrac12 G_{uu} & \tfrac12 E_u & F_u-\tfrac12 E_v \\ F_v-\tfrac12 G_u & E & F \\ \tfrac12 G_v & F & G \end{vmatrix} - \begin{vmatrix} 0 & \tfrac12 E_v & \tfrac12 G_u \\ \tfrac12 E_v & E & F \\ \tfrac12 G_u & F & G \end{vmatrix}\right\},$

where $E$, $F$, $G$ are the first fundamental quantities:

$E = x_u^2+y_u^2+z_u^2,\quad F = x_ux_v+y_uy_v+z_uz_v,\quad G = x_v^2+y_v^2+z_v^2,$

and $W^2=EG-F^2$. As is usual in differential geometry, we assume throughout
this paper that $W\ne 0$ for the representations to be considered.

Suppose now that the surface is given in an isothermic representation;
that is to say, suppose that $E=G$, $F=0$. Put $E=G=\lambda(u, v)$. The assumption
$W\ne 0$ is then equivalent to $\lambda(u, v)>0$. The above formula for $K$ reduces to
the form

$K = \dfrac{1}{2\lambda^3}\left(\lambda_u^2+\lambda_v^2-\lambda\,\Delta\lambda\right),$

where $\Delta$ is the symbol

$\Delta = \dfrac{\partial^2}{\partial u^2}+\dfrac{\partial^2}{\partial v^2}.$

* Presented to the Society, April 14, 1933, under the title On the isoperimetric inequality; received
by the editors February 16, 1933.

† National Research Fellow.

‡ See W. Blaschke, Differentialgeometrie, Berlin, J. Springer, 1930, p. 93.

0.2. By computation it follows that

$\Delta\log\lambda = \dfrac{\lambda\,\Delta\lambda-(\lambda_u^2+\lambda_v^2)}{\lambda^2}.$

Hence, we have the formula*

(1) $K = -\dfrac{1}{2\lambda}\,\Delta\log\lambda.$


Consequently, if $K\le 0$ on our surface, then $\Delta\log\lambda\ge 0$, that is to say, $\log\lambda$ is
subharmonic in the terminology of F. Riesz.† Conversely, if $\log\lambda$ is subharmonic,
then $K\le 0$ on the surface.
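
Formula (1) lends itself to a quick numerical sanity check. The sketch below is an illustration only and rests on stated assumptions: the Laplacian is replaced by a five-point finite difference with a hypothetical step `h`, and the two test metrics are chosen here, not taken from the text of this paper: $\lambda=4/(1+u^2+v^2)^2$ (the unit sphere in stereographic parameters, where $K=1$) and $\lambda=9(1+u^2+v^2)^2$ (the value of $E=G$ for the surface of Enneper $x=3u+3uv^2-u^3$, $y=-3v-3u^2v+v^3$, $z=3u^2-3v^2$, for which direct computation with (1) gives $K=-4/[9(1+u^2+v^2)^4]$).

```python
import math


def laplacian(f, u, v, h=1e-3):
    """Five-point central-difference approximation of f_uu + f_vv."""
    return (f(u + h, v) + f(u - h, v) + f(u, v + h) + f(u, v - h)
            - 4.0 * f(u, v)) / (h * h)


def gauss_curvature(lam, u, v, h=1e-3):
    """K = -(1 / (2 lam)) * Laplacian(log lam) for ds^2 = lam (du^2 + dv^2)."""
    def log_lam(a, b):
        return math.log(lam(a, b))
    return -laplacian(log_lam, u, v, h) / (2.0 * lam(u, v))


def sphere_lambda(u, v):
    """Unit sphere in stereographic parameters (assumed test case, K = 1)."""
    return 4.0 / (1.0 + u * u + v * v) ** 2


def enneper_lambda(u, v):
    """E = G for the Enneper surface scaled as in the companion paper."""
    return 9.0 * (1.0 + u * u + v * v) ** 2


def enneper_curvature_exact(u, v):
    """Closed form obtained from formula (1) by hand for enneper_lambda."""
    return -4.0 / (9.0 * (1.0 + u * u + v * v) ** 4)
```

In the first case $\log\lambda$ is superharmonic and $K>0$; in the second $\log\lambda$ is subharmonic and $K<0$, in agreement with the equivalence just stated.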

This relation between subharmonic functions and surfaces of negative 
curvature suggests geometrical applications of the theory of subharmonic 
functions. On the other hand, the geometrical interpretation suggests ques- 
tions concerning subharmonic functions. The purpose of this paper is to 
present a few results which we have obtained in this way. 

0.3. One of our geometrical results is concerned with the isoperimetric
inequality. Among all simply-connected plane regions whose boundaries are
rectifiable and have a given length $l$, the circle has the maximum area. This
fact may also be stated as follows: if $a$ is the area and $l$ the length of the boundary
of a simply-connected plane region, then $a$ and $l$ satisfy the isoperimetric
inequality $a\le l^2/(4\pi)$. Carleman proved that this same inequality holds for
every simply-connected rectifiable piece of a minimal surface.‡ We shall prove
that the isoperimetric inequality holds for every simply-connected rectifiable
piece of every surface whose Gauss curvature $K$ is $\le 0$. This generalization
is, in a way, final§; indeed, it is almost trivial (cf. §2.7) that if a surface
has the property that every simply-connected piece on it satisfies the
isoperimetric inequality, then $K\le 0$ on the surface.

We shall make in our work the assumption, customary in differential 


* See for instance A. R. Forsyth, Differential Geometry, Cambridge University Press, 1912,
p. 84.

† See F. Riesz, Sur les fonctions subharmoniques et leur rapport à la théorie du potentiel (in two
parts), Acta Mathematica, vol. 48 (1926), pp. 329-343, and vol. 54 (1930), pp. 321-360. We shall
confine ourselves to the case of continuous subharmonic functions, though Riesz defines them more
broadly. For a systematic treatment of the elementary properties of these functions, see P. Montel,
Sur les fonctions convexes et les fonctions sousharmoniques, Journal de Mathématiques, (9), vol. 7
(1928), pp. 29-60.

‡ T. Carleman, Zur Theorie der Minimalflächen, Mathematische Zeitschrift, vol. 9 (1921), pp.
154-160.

§ That is, it is final in so far as the case of surfaces with $K\le 0$ is concerned. It is known, however,
that for convex regions on a sphere with $K=K_0>0$ we have $a\le(l^2+K_0a^2)/(4\pi)$. See F. Bernstein,
Über die isoperimetrische Eigenschaft des Kreises auf der Kugeloberfläche und in der Ebene,
Mathematische Annalen, vol. 60 (1905), pp. 117-136. There are indications that perhaps the same
inequality holds for simply-connected rectifiable regions on surfaces of constant negative curvature,
$K=K_0<0$. The question arises then as to whether or not for all real $K_0$ the inequality
$a\le(l^2+K_0a^2)/(4\pi)$ characterizes surfaces with variable curvature $K\le K_0$.




geometry, that the surfaces and curves to be considered are analytic. This 
obviously unnecessary assumption serves the twofold purpose of avoiding 
certain unessential complications which would obscure the unity and sim- 
plicity of the method, and of dodging certain essential difficulties which 
seem to require a thorough and presumably interesting study. 

Besides the isoperimetric inequality, we shall discuss briefly a few the- 
orems which have been first proved for conformal maps of plane regions, have 
then been extended to conformal maps of minimal surfaces, and will be shown 
in this paper to hold for conformal maps of surfaces with $K\le 0$.

0.4. The following notation will simplify our next statement. If $g(u, v)$ is
a continuous function in a domain $D$, we shall put

$A(g; u_0, v_0; \rho) = \dfrac{1}{\pi\rho^2}\displaystyle\iint_{\xi^2+\eta^2<\rho^2} g(u_0+\xi,\ v_0+\eta)\,d\xi\,d\eta,$

$L(g; u_0, v_0; \rho) = \dfrac{1}{2\pi}\displaystyle\int_0^{2\pi} g(u_0+\rho\cos\varphi,\ v_0+\rho\sin\varphi)\,d\varphi,$

where $(u_0, v_0)$ is the center and $\rho$ is the radius of a circular disc $\kappa$: $(u-u_0)^2+(v-v_0)^2\le\rho^2$
which is comprised in $D$.

An important inequality, due to Carleman,* can be stated then as follows:
if $f(w)$ is an analytic function of $w=u+iv$ in $D$, then

(2) $[A(|f|^2; u_0, v_0; \rho)]^{1/2}\le L(|f|; u_0, v_0; \rho)$

for every circular disc comprised in $D$. We shall show that a function $g(u, v)$,
continuous and $\ge 0$ in $D$, satisfies the inequality

(3) $[A(g^2; u_0, v_0; \rho)]^{1/2}\le L(g; u_0, v_0; \rho)$

for every circular disc comprised in $D$, if and only if $\log g(u, v)$ is subharmonic
in the part of $D$ where $g(u, v)>0$.

We shall use in this paper, as we did in a previous one,† the term function
of class PL, meaning a function $g(u, v)$ continuous and $\ge 0$, and such that
$\log g(u, v)$ is subharmonic wherever $g(u, v)>0$. Then the above inequality (3)
expresses a characteristic property of functions of class PL. On account of 
the formula (1) for K, this analytic fact is then readily seen to be equivalent 
to the geometric fact that the isoperimetric inequality is characteristic for 
surfaces with negative curvature (as explained in §0.3). 
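
A small numerical experiment may make inequality (3) concrete. The sketch below is illustrative only: the midpoint quadrature, the grid sizes, and the sample functions are assumptions made here, not part of the paper. It evaluates $L(g)-[A(g^2)]^{1/2}$ on a disc for $g=|w|$ (of class PL), for $g\equiv 1$ (the plane case, where equality holds), and for $g=1/(1+u^2+v^2)$, whose logarithm is superharmonic, so that (3) is expected to fail.

```python
import math


def area_mean(g, u0, v0, rho, n_r=200, n_phi=200):
    """A(g; u0, v0; rho): mean of g over the disc, midpoint rule in polar coordinates."""
    dr = rho / n_r
    dphi = 2.0 * math.pi / n_phi
    total = 0.0
    for i in range(n_r):
        r = (i + 0.5) * dr
        for j in range(n_phi):
            phi = (j + 0.5) * dphi
            total += g(u0 + r * math.cos(phi), v0 + r * math.sin(phi)) * r
    return total * dr * dphi / (math.pi * rho * rho)


def circle_mean(g, u0, v0, rho, n_phi=200):
    """L(g; u0, v0; rho): mean of g over the circumference, midpoint rule."""
    dphi = 2.0 * math.pi / n_phi
    total = 0.0
    for j in range(n_phi):
        phi = (j + 0.5) * dphi
        total += g(u0 + rho * math.cos(phi), v0 + rho * math.sin(phi))
    return total * dphi / (2.0 * math.pi)


def inequality_gap(g, u0, v0, rho):
    """L(g) - [A(g^2)]^(1/2); non-negative when (3) holds on this disc."""
    def g_squared(u, v):
        return g(u, v) ** 2
    return circle_mean(g, u0, v0, rho) - math.sqrt(area_mean(g_squared, u0, v0, rho))


def modulus(u, v):
    """|w|, the absolute value of the analytic function f(w) = w (class PL)."""
    return math.hypot(u, v)


def constant_one(u, v):
    """|F'(w)| for F(w) = w; the equality case of (3)."""
    return 1.0


def not_pl(u, v):
    """1 / (1 + u^2 + v^2); log of this is superharmonic, so it is not of class PL."""
    return 1.0 / (1.0 + u * u + v * v)
```

For the unit disc about the origin the three gaps can be found by hand: $1-2^{-1/2}$ for `modulus`, $0$ for `constant_one`, and $\tfrac12-2^{-1/2}<0$ for `not_pl`.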

0.5. It is natural to ask what happens if we replace in (3) the exponents 2
and 1/2 by $\beta$ and $1/\beta$ respectively, where $\beta$ is any real number. At the end
of §1 we shall make a few very incomplete remarks concerning this question.


* Mathematische Zeitschrift, loc. cit.
† Subharmonic functions and minimal surfaces, these Transactions, vol. 35 (1933), pp. 648-661.




1. A CHARACTERIZATION OF FUNCTIONS OF CLASS PL* 


1.1. The familiar example of a function of class PL is the absolute value
of an analytic function $f(w)$ of the complex variable $w=u+iv$. Indeed, as
is well known, $\log|f(w)|$ is a harmonic function of $u$ and $v$, that is to say,
$\Delta\log|f(w)|=0$.

We have the following theorem, due to Carleman.†

If $f(w)$ is continuous in the unit circle $|w|\le 1$ and analytic in $|w|<1$, then

(4) $\displaystyle\iint_{|w|<1}|f(w)|^2\,du\,dv\le\dfrac{1}{4\pi}\left(\int_{|w|=1}|f(w)|\,ds\right)^2.$

The sign of equality in (4) holds if and only if $f(w)=F'(w)$, where $F(w)$ is a
linear function

$\dfrac{aw+b}{cw+d}$

regular in $|w|\le 1$.

1.2. If we write the inequality (4) of Carleman as we did in §0.4, then there
arises the following question. Given a domain $D$ in the $uv$-plane, we ask for all
functions $g(u, v)$, which are continuous and $\ge 0$ in $D$, and satisfy the inequality
(3) for every point $(u_0, v_0)$ in $D$ and for every $\rho$ such that the circular
disc $(u-u_0)^2+(v-v_0)^2\le\rho^2$ is comprised in $D$. We shall prove the following

Lemma. A function $g(u, v)$, continuous and $\ge 0$ in a domain $D$, satisfies the
inequality

(5) $[A(g^2; u_0, v_0; \rho)]^{1/2}\le L(g; u_0, v_0; \rho)$

for every point $(u_0, v_0)$ in $D$ and for every $\rho$, such that the circular disc
$(u-u_0)^2+(v-v_0)^2\le\rho^2$ is comprised in $D$, if and only if $g(u, v)$ is of class PL in $D$.

1.3. Let us first prove that if $g(u, v)$ is of class PL in $D$, then the inequality
(5) is satisfied. Suppose first that $g(u, v)>0$ in $D$. Consider any circular disc $\kappa$,
comprised in $D$, with center $(u_0, v_0)$ and radius $\rho$. Denote by $C$ the perimeter
of $\kappa$. Put

$\log g(u, v) = \varphi(u, v);$

then by assumption $\varphi(u, v)$ is subharmonic. Let $h(u, v)$ be the harmonic function
in $\kappa$ coinciding with $\varphi(u, v)$ on $C$. Then $\varphi(u, v)\le h(u, v)$ in $\kappa$‡, that is,
$g(u, v)\le e^{h(u, v)}$ in $\kappa$. Consequently,


* For the precise definition and a discussion of elementary facts concerning these functions,
see the authors' paper just cited.

† Mathematische Zeitschrift, loc. cit.

‡ This is a general relation between a subharmonic function and a dominating harmonic function.
See F. Riesz, Acta Mathematica, loc. cit., first part, p. 331.


(6) $[A(g^2; u_0, v_0; \rho)]^{1/2}\le[A(e^{2h}; u_0, v_0; \rho)]^{1/2}.$

Also, $g(u, v)=e^{h(u, v)}$ on $C$, so that

(7) $L(e^h; u_0, v_0; \rho) = L(g; u_0, v_0; \rho).$

Let $h^*(u, v)$ be the conjugate harmonic function of $h(u, v)$. Then $f(w)=e^{h+ih^*}$
is an analytic function of $w=u+iv$, and $|f(w)|=e^h$. By Carleman's
inequality (2) then

(8) $[A(e^{2h}; u_0, v_0; \rho)]^{1/2}\le L(e^h; u_0, v_0; \rho).$

(5) follows from (6), (7), and (8).

Suppose now only that the function $g(u, v)$ of class PL is $\ge 0$ in $D$. Consider
$g(u, v)+\epsilon$, where $\epsilon$ is a constant $>0$. Then $g(u, v)+\epsilon$ is $>0$ and of class
PL.† Accordingly, the above discussion can be applied to $g(u, v)+\epsilon$, so that
(5) holds for this function. As $g(u, v)$ is the uniform limit of $g(u, v)+\epsilon$ as
$\epsilon\to 0$, we have (5) for a general $g(u, v)$ of class PL.

1.4. We shall show now that if $g(u, v)$ is a non-negative function defined
and continuous in $D$, and if for every circular disc $\kappa$ comprised in $D$, the
inequality (5) holds, then $g(u, v)$ is of class PL.

Suppose first that $g(u, v)$ has continuous derivatives of the first and second
order, and let these derivatives be denoted by their standard symbols $p$, $q$, $r$,
$s$, $t$. We assume for convenience that the point $(u_0, v_0)$ under discussion is $(0, 0)$
and denote by $p_0$, etc., the value of $p$, etc., at $(0, 0)$. Finally we shall denote
by $\sigma_i$ certain quantities such that $\sigma_i/\rho^2\to 0$ as $\rho\to 0$, where $\rho^2=u^2+v^2$.

We have then, by the finite Taylor expansion,

$g(u, v) = g_0+p_0u+q_0v+\tfrac12(r_0u^2+2s_0uv+t_0v^2)+\sigma_1$

$\qquad = g_0+(p_0\cos\varphi+q_0\sin\varphi)\rho+\tfrac12(r_0\cos^2\varphi+2s_0\cos\varphi\sin\varphi+t_0\sin^2\varphi)\rho^2+\sigma_1,$

$g(u, v)^2 = g_0^2+2g_0(p_0\cos\varphi+q_0\sin\varphi)\rho$

$\qquad +[(p_0^2+g_0r_0)\cos^2\varphi+2(p_0q_0+g_0s_0)\cos\varphi\sin\varphi+(q_0^2+g_0t_0)\sin^2\varphi]\rho^2+\sigma_2,$

so that

$L(g; 0, 0; \rho) = \dfrac{1}{2\pi}\displaystyle\int_0^{2\pi}g(\rho\cos\varphi,\ \rho\sin\varphi)\,d\varphi = g_0+\tfrac14\rho^2(r_0+t_0)+\sigma_3,$

$[L(g; 0, 0; \rho)]^2 = g_0^2+\tfrac12\rho^2g_0(r_0+t_0)+\sigma_4,$


† For the fact that the sum of two functions of class PL is again a function of class PL, see the
authors' paper in vol. 35 of these Transactions, pp. 648-661, §1.10.


$A(g^2; 0, 0; \rho) = \dfrac{1}{\pi\rho^2}\displaystyle\int_0^{\rho}r\,dr\int_0^{2\pi}g(r\cos\varphi,\ r\sin\varphi)^2\,d\varphi$

$\qquad = g_0^2+\tfrac14\rho^2[p_0^2+q_0^2+g_0(r_0+t_0)]+\sigma_5.$

By assumption, then,

$g_0^2+\tfrac14\rho^2[p_0^2+q_0^2+g_0(r_0+t_0)]+\sigma_5\le g_0^2+\tfrac12\rho^2g_0(r_0+t_0)+\sigma_4,$

$(p_0^2+q_0^2)-g_0(r_0+t_0)\le 4(\sigma_4-\sigma_5)/\rho^2.$

The right-hand member of this last inequality $\to 0$ as $\rho\to 0$, so that the
left-hand member is $\le 0$. Since any point of $D$ can be taken as $(u_0, v_0)$, we
have then

$g(r+t)-(p^2+q^2)\ge 0$

in $D$. Hence $g(u, v)$ is of class PL, since by computation (cf. §0.2)

$\Delta\log g = \dfrac{g(r+t)-(p^2+q^2)}{g^2}$

wherever $g>0$.
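
The identity for $\Delta\log g$ used in the last step can be spot-checked numerically. The following sketch is only an illustration under stated assumptions: the sample function `g`, its hand-computed derivatives, and the finite-difference step are all hypothetical choices made here. It compares the closed form $[g(r+t)-(p^2+q^2)]/g^2$ with a five-point difference approximation of the Laplacian of $\log g$.

```python
import math


def g(u, v):
    """Hypothetical sample function, positive near the origin."""
    return 2.0 + u * u + 0.5 * u * v + math.sin(v)


def derivatives(u, v):
    """Exact p = g_u, q = g_v, r = g_uu, t = g_vv of the sample g."""
    p = 2.0 * u + 0.5 * v
    q = 0.5 * u + math.cos(v)
    r = 2.0
    t = -math.sin(v)
    return p, q, r, t


def laplacian_log_g_formula(u, v):
    """[g (r + t) - (p^2 + q^2)] / g^2, the closed form quoted in the text."""
    p, q, r, t = derivatives(u, v)
    return (g(u, v) * (r + t) - (p * p + q * q)) / g(u, v) ** 2


def laplacian_log_g_numeric(u, v, h=1e-3):
    """Five-point central-difference Laplacian of log g."""
    def f(a, b):
        return math.log(g(a, b))
    return (f(u + h, v) + f(u - h, v) + f(u, v + h) + f(u, v - h)
            - 4.0 * f(u, v)) / (h * h)
```

The sign of the numerator $g(r+t)-(p^2+q^2)$ then decides, point by point, whether $\log g$ is subharmonic there.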
1.5. Suppose now† that $g(u, v)$ has continuous derivatives of only the
first order, but otherwise satisfies the conditions of §1.4. For a small fixed
$\tau>0$, put

$g(u, v; \tau) = \dfrac{1}{\pi\tau^2}\displaystyle\iint_{\xi^2+\eta^2<\tau^2}g(u+\xi,\ v+\eta)\,d\xi\,d\eta.$

(Of course $g(u, v; \tau)$ can be defined thus for only a subdomain $D'$ of $D$, but
this is of no consequence since $\tau$ is arbitrarily small.) That this function
$g(u, v; \tau)$ also satisfies (5) follows from Minkowski's inequality.‡ Furthermore
$g(u, v; \tau)$ has continuous derivatives of the second order.§ Hence $g(u, v; \tau)$


† The assumptions of §1.4 are sufficient for the applications to differential geometry which we
shall make in §2, so that the reader interested primarily in those applications can omit §1.5 and §1.6
without loss of continuity in the discussion.

‡ The necessary inequality follows, by a familiar passage to the limit, from the inequality

$\left[\sum_i\left(\sum_k a_{ik}\right)^2\right]^{1/2}\le\sum_k\left[\sum_i a_{ik}^2\right]^{1/2},$

which has the geometrical significance that the length of a polygonal line is at least as great as that
of the line segment joining its end points.

§ Concerning the properties and applications of this approximation by integral means, see E.
Levi, Sopra una proprietà caratteristica delle funzioni armoniche, Atti della Reale Accademia dei
Lincei, vol. 18 (1909), pp. 10-15; H. E. Bray, Proof of a formula for an area, Bulletin of the American
Mathematical Society, vol. 29 (1923), pp. 264-270; T. Radó, Remarque sur les fonctions
subharmoniques, Paris Comptes Rendus, vol. 186, pp. 346-348; T. Radó, Sur le calcul de l'aire des surfaces
courbes, Fundamenta Mathematicae, vol. 10 (1927), pp. 197-210; F. Riesz, loc. cit., second part, pp.
342-345.




satisfies all the conditions of §1.4 and so is of class PL. Since $g(u, v; \tau)\to g(u, v)$
as $\tau\to 0$ it follows that $g(u, v)$ is of class PL.

1.6. Suppose finally that $g(u, v)$ is only continuous, but otherwise satisfies
the conditions of §1.4. Then $g(u, v; \tau)$, defined as above, has continuous first
derivatives†, and hence it satisfies the assumptions of §1.5. According to
§1.5, $g(u, v; \tau)$ is of class PL and consequently its uniform limit $g(u, v)$ is of
class PL.

1.7. With regard to an application which we shall make in §2, we need a 
slight (and incomplete) discussion of the sign of equality in (5). Suppose that 
$g(u, v)$ is continuous and positive in $(u-u_0)^2+(v-v_0)^2 \le \rho^2$ and that $g(u, v)$ 
is of class PL in $(u-u_0)^2+(v-v_0)^2<\rho^2$. Suppose that 

(9) $[A(g^2; u_0, v_0; \rho)]^{1/2} = L(g; u_0, v_0; \rho)$. 

Then $g(u, v) = |F'(w)|$, where $F(w)$ is a linear function 

$\dfrac{aw+b}{cw+d}$ 

which is regular in $|w-w_0|<\rho$ and which does not reduce to a constant. 
Indeed, if we go through the discussion in §1.3, we find that in order to 
have (9), we must have (with the notations of §1.3) 

$g(u, v) = |f(w)|$, 

where $f(w)$ satisfies the inequality (4) of Carleman with the sign of equality. 
On account of the theorem of Carleman, we have then $f(w) = F'(w)$, where 
$F(w)$ has the desired form. This $F(w)$ cannot reduce to a constant at present, 
since then it would follow that $g(u, v) = |F'(w)| = 0$, while we supposed that 
$g(u, v) > 0$ throughout. 
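
As a numerical illustration of the equality case just discussed (a sketch added for the reader, not part of the original argument), the following Python snippet approximates both sides of (9) on the unit disc about the origin. It assumes that $A(g; 0, 0; \rho)$ is the mean of $g$ over the disc and $L(g; 0, 0; \rho)$ the mean over its circumference, which is how the symbols are used in (12). The function names, the quadrature sizes, and the two sample functions are illustrative choices only.

```python
import cmath
import math


def circle_mean(g, rho, n_theta=720):
    """Mean of g over the circle |w| = rho, using equally spaced sample points."""
    total = 0.0
    for k in range(n_theta):
        w = rho * cmath.exp(2j * math.pi * k / n_theta)
        total += g(w)
    return total / n_theta


def disc_mean(g, rho, n_r=200, n_theta=720):
    """Mean of g over the disc |w| <= rho, using the midpoint rule in the radius."""
    dr = rho / n_r
    total = 0.0
    for i in range(n_r):
        r = (i + 0.5) * dr
        total += circle_mean(g, r, n_theta) * r * dr
    # (1 / (pi rho^2)) * integral of (2 pi * ring mean * r) dr
    return 2.0 * total / rho**2


def both_sides(f, rho=1.0):
    """Return ([A(g^2)]^(1/2), L(g)) for g = |f(w)| on the disc |w| <= rho."""
    lhs = math.sqrt(disc_mean(lambda w: abs(f(w)) ** 2, rho))
    rhs = circle_mean(lambda w: abs(f(w)), rho)
    return lhs, rhs


def mobius_derivative(w):
    # F(w) = w / (1 - w/2) is linear fractional and F'(w) = 1 / (1 - w/2)^2.
    return 1.0 / (1.0 - w / 2.0) ** 2


def generic_analytic(w):
    # f(w) = 1 + w/2 is analytic and zero-free on the closed unit disc,
    # but it is not the derivative of a linear fractional function.
    return 1.0 + w / 2.0


if __name__ == "__main__":
    print("linear fractional case:", both_sides(mobius_derivative))
    print("generic analytic case: ", both_sides(generic_analytic))
```

For the linear fractional example the two sides should agree up to quadrature error (the image of the unit disc is a disc of radius 4/3), while for the second example the left side should come out strictly smaller.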

1.8. The question arises as to the significance of the inequality (5) if we 
replace the exponent 2 by a general (real) exponent $\beta$. The case $\beta=1$ can be 
settled easily; the reasoning used above in the case $\beta=2$ applies directly.‡ 
There follows 

A function $g(u, v)$, continuous in a domain $D$, satisfies there the inequality 

$A(g; u_0, v_0; \rho) \le L(g; u_0, v_0; \rho)$ 

for every point $(u_0, v_0)$ in $D$ and for every $\rho$, such that the circular disc 
$(u-u_0)^2+(v-v_0)^2 \le \rho^2$ is comprised in $D$, if and only if $g(u, v)$ is subharmonic in $D$. 

1.9. For values of $\beta$ other than 1 and 2, the method of §1.4, §1.5, §1.6 
yields theorems whose statements vary according to the location of $\beta$ with 

† See third footnote on p. 667. 
‡ Actually, though, there is a much simpler way of handling the case $\beta=1$. 


1933] SURFACES OF NEGATIVE CURVATURE 669 


respect to the special values 0, 1, 2. By way of illustration, we mention the 
following statements. 

Suppose $g(u, v)$ is continuous and $\ge 0$ in a domain $D$. Suppose that for a 
certain exponent $\beta$ the inequality 

(10) $[A(g^\beta; u_0, v_0; \rho)]^{1/\beta} \le L(g; u_0, v_0; \rho)$ 

holds for every point $(u_0, v_0)$ in $D$ and for every $\rho$ such that the circular disc 
$(u-u_0)^2+(v-v_0)^2 \le \rho^2$ is comprised in $D$. 

If $1<\beta<2$, then it follows that $g^{2-\beta}$ is subharmonic in $D$. 

If $\beta>2$, it follows that $1/g^{\beta-2}$ is superharmonic. 

In a general way, the greater $\beta$, the stronger the inference will be as to the 
subharmonic character of $g(u, v)$. For $\beta<1$, $g(u, v)$ need not be subharmonic. 

For $\beta=1$ and $\beta=2$ the inequality (10) has been shown, in what precedes, 
to be a necessary and sufficient criterion for a certain subharmonic property. 
An equally complete discussion for a general exponent might lead to interesting 
questions. 


2. APPLICATIONS TO SURFACES OF NEGATIVE CURVATURE 
2.1. Let there be given a piece of surface $S$ in a representation 

(11) $S: x = x(u, v),\ y = y(u, v),\ z = z(u, v),\quad u^2+v^2 \le \rho^2$, 

with the following properties. 

(a) $x(u, v)$, $y(u, v)$, $z(u, v)$ and their first partial derivatives are continuous 
in $u^2+v^2 \le \rho^2$. 

(b) In $u^2+v^2<\rho^2$, $x(u, v)$, $y(u, v)$, $z(u, v)$ have continuous partial derivatives 
of the third order. 

(c) The representation (11) is isothermic, that is to say, $E=G$, $F=0$, in 
$u^2+v^2<\rho^2$. We put $E=G=\lambda(u, v)$. Then $\lambda \ge 0$; but we suppose that $\lambda>0$ in 
$u^2+v^2 \le \rho^2$. 

2.2. Lemma. If the Gauss curvature $K$ of the surface $S$, given in a representation 
as described in §2.1, is $\le 0$, then the area $a$ and the perimeter $l$ of $S$ satisfy 
the inequality $a \le l^2/(4\pi)$. The sign of equality holds if and only if $K \equiv 0$, and 
$S$ is a geodesic circle (that is to say, $S$ is a developable and there exists a point $O$ 
on $S$ such that the geodesic distance of $O$ from every point of the perimeter of $S$ is 
the same). 

The proof is as follows. With the notation of §0.4 we have 

(12) $a = \pi\rho^2 A(\lambda; 0, 0; \rho), \quad l = 2\pi\rho\, L(\lambda^{1/2}; 0, 0; \rho)$. 


On account of the assumption $K \le 0$, the function $\lambda(u, v)$ is of class PL (see 
§0.2 and the definition of functions of class PL). Hence† the function 
$\lambda(u, v)^{1/2}$ is also of class PL. From §1.2 it follows, therefore, for $g=\lambda^{1/2}$, that 

(13) $[A(\lambda; 0, 0; \rho)]^{1/2} \le L(\lambda^{1/2}; 0, 0; \rho)$. 

The inequality $a \le l^2/(4\pi)$ follows now immediately from (12) and (13). 
Suppose now that we have $a=l^2/(4\pi)$. Then we must have the sign of 
equality in (13). Consequently (see §1.7) we have 

(14) $\lambda(u, v)^{1/2} = |F'(w)|$, 

where $F(w)$ has the form $(aw+b)/(cw+d)$, and $F(w)$ is regular and not constant 
in $|w|<\rho$. Hence the equation $w^*=F(w)$ carries the circle $|w|<\rho$ 
in a one-to-one and conformal way into a certain circular disc $\kappa^*$ in the 
$w^*=u^*+iv^*$ plane. Introducing $u^*$, $v^*$ as new parameters, we obtain the 
equations of $S$ in the form 

(15) $x = \xi(u^*, v^*),\ y = \eta(u^*, v^*),\ z = \zeta(u^*, v^*),\quad (u^*, v^*)$ in $\kappa^*$. 

Since we passed from the isothermic parameters $u$, $v$ to the new parameters 
$u^*$, $v^*$ by a conformal map, it follows that $u^*$, $v^*$ are also isothermic parameters. 
Hence if we denote by $E^*$, $F^*$, $G^*$ the first fundamental quantities 
relative to the representation (15), we have $E^*=G^*$, $F^*=0$. If we put 
$E^*=G^*=\lambda^*(u^*, v^*)$, then we have, by simple computation, 

$\lambda^{*1/2} = \lambda^{1/2}\left|\dfrac{dw}{dw^*}\right| = \dfrac{\lambda^{1/2}}{|F'(w)|} = 1$, 

on account of (14). Hence $E^*=G^*=1$, $F^*=0$. That is to say, the representation 
(15) is an isometric map of $S$ (every arc on $S$ has the same length as its 
image). 

2.3. In order to apply the lemma of §2.2 to a given piece of surface, we have 
to represent the surface as required in §2.1. Thus it is necessary to refer to 
existence theorems on conformal mapping, and the validity of the isoperimetric 
inequality $a \le l^2/(4\pi)$ is made to depend upon the available results 
concerning the theory of conformal mapping. Since we are unable at this 
time to prove the most general statement which is likely to be true, we restrict 
ourselves to the following theorem which might be considered as perfectly 
general according to the usual standards in differential geometry. 


2.4. THEOREM. Let there be given an analytic surface in the xyz-space, that 
is to say, a surface which admits, in the vicinity of every one of its points, a 
representation $x=x(u, v)$, $y=y(u, v)$, $z=z(u, v)$, where $x(u, v)$, $y(u, v)$, $z(u, v)$ 


† Every positive power of a function of class PL is again a function of class PL; see the authors’ 
paper in these Transactions, vol. 35, pp. 648-661, §1.8. 




are analytic functions of $u$, $v$, and where $EG-F^2>0$. Denote by $K$ the Gauss 
curvature of the surface. Then $K \le 0$ is a necessary and sufficient condition that 
the area $a$ and the perimeter $l$ of every simply-connected portion, bounded by an 
analytic curve, of the surface satisfy the isoperimetric inequality $a \le l^2/(4\pi)$. 


2.5. To prove this theorem, suppose first that $K \le 0$. Let $S$ be any simply-connected 
portion of the surface, bounded by an analytic curve. Take a 
simply-connected open portion $S^*$ of the surface, such that $S$ is interior to 
$S^*$. On account of general theorems, $S^*$ admits of an isothermic representation 

$S^*: x = \xi(u^*, v^*),\ y = \eta(u^*, v^*),\ z = \zeta(u^*, v^*),\quad u^{*2}+v^{*2}<1$, 

where $E^*=G^*>0$, $F^*=0$, and $\xi$, $\eta$, $\zeta$ are analytic functions of $u^*$, $v^*$. The 
portion $S$ appears in this map as a Jordan region $R^*$ in $u^{*2}+v^{*2}<1$, bounded 
by an analytic Jordan curve $C^*$. We map then $R^*$ in a one-to-one and conformal 
way upon $u^2+v^2<1$; on account of the analyticity of $C^*$, this map remains 
analytic on $u^2+v^2=1$. Thus we obtain a map of $S$ as required in §2.1, 
and then the lemma of §2.2 gives the desired inequality $a \le l^2/(4\pi)$. 

2.6.† Suppose, conversely, that we have $a \le l^2/(4\pi)$ for every portion $S$ 
of a surface as described in §2.4. Take such a portion $S$. Applying the construction 
of §2.5, we obtain for $S$ a representation 

(16) $S: x = x(u, v),\ y = y(u, v),\ z = z(u, v),\quad u^2+v^2 \le 1$, 

with the properties required in §2.1. If $\kappa: (u-u_0)^2+(v-v_0)^2 \le \rho^2$ is any circular 
disc comprised in $u^2+v^2<1$, then there corresponds to $\kappa$, by means of 
(16), a portion $S_0$ whose area $a_0$ and perimeter $l_0$ satisfy by assumption the 
inequality $a_0 \le l_0^2/(4\pi)$. If we use again the notation $E=G=\lambda(u, v)$, then 

$a_0 = \pi\rho^2 A(\lambda; u_0, v_0; \rho), \quad l_0 = 2\pi\rho\, L(\lambda^{1/2}; u_0, v_0; \rho)$; 

and hence $a_0 \le l_0^2/(4\pi)$ implies that 

$[A(\lambda; u_0, v_0; \rho)]^{1/2} \le L(\lambda^{1/2}; u_0, v_0; \rho)$. 

Since this holds for every circular disc $(u-u_0)^2+(v-v_0)^2 \le \rho^2$ comprised in 
$u^2+v^2<1$, it follows (see §1.2) that $\lambda^{1/2}$ and consequently‡ $\lambda$ is of class PL 
in $u^2+v^2<1$. Hence (see §0.2) $K \le 0$ on $S$. Since $S$ was any portion of the given 
surface, this proves that $K \le 0$ on the whole surface. 

2.7. The reasoning of §2.6 can be replaced by the following argument. 
Take any point $O$ on the surface and denote by $S_\rho$ the portion of the surface 
which consists of the points of the surface within and on the geodesic circle 


† A differential geometer will probably find the proof of §2.7 preferable. 
‡ See footnote on p. 670. 




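with center $O$ and radius $\rho$. Then the area $a(\rho)$ and perimeter $l(\rho)$ of $S_\rho$ are functions 
of $\rho$ which admit of developments beginning as follows:† 

$a(\rho) = \pi\rho^2 - \dfrac{1}{12}\pi K_0\rho^4 + \cdots$, 

$l(\rho) = 2\pi\rho - \dfrac{1}{3}\pi K_0\rho^3 + \cdots$, 

where $K_0$ is the Gauss curvature at the point $O$. We have then 

$a(\rho) - \dfrac{1}{4\pi}l(\rho)^2 = \dfrac{1}{4}\pi K_0\rho^4 + \cdots$, 

$K_0 = \dfrac{4}{\pi}\lim_{\rho\to 0}\dfrac{a(\rho) - \dfrac{1}{4\pi}l(\rho)^2}{\rho^4}$. 

Since, by assumption, the numerator is $\le 0$, this proves that $K_0 \le 0$. Since $O$ 
is any point on the surface, we have then $K \le 0$ on the whole surface. 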
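
The limit formula of §2.7 can be tried numerically on two model surfaces. The sketch below is an illustration only: it uses the standard closed-form area and perimeter of a geodesic circle on the unit sphere (curvature 1) and on the hyperbolic plane (curvature −1), which are textbook facts rather than results of this paper, and the function names are hypothetical.

```python
import math


def curvature_estimate(area, perimeter, rho):
    """Return (4/pi) * (a - l^2 / (4 pi)) / rho^4 for a geodesic circle of radius rho."""
    a = area(rho)
    l = perimeter(rho)
    return (4.0 / math.pi) * (a - l * l / (4.0 * math.pi)) / rho**4


# Geodesic circles on the unit sphere (Gauss curvature +1).
def sphere_area(rho):
    return 2.0 * math.pi * (1.0 - math.cos(rho))


def sphere_perimeter(rho):
    return 2.0 * math.pi * math.sin(rho)


# Geodesic circles in the hyperbolic plane (Gauss curvature -1).
def hyperbolic_area(rho):
    return 2.0 * math.pi * (math.cosh(rho) - 1.0)


def hyperbolic_perimeter(rho):
    return 2.0 * math.pi * math.sinh(rho)


if __name__ == "__main__":
    rho = 0.05  # small radius, so that the leading term of the development dominates
    print("sphere:    ", curvature_estimate(sphere_area, sphere_perimeter, rho))
    print("hyperbolic:", curvature_estimate(hyperbolic_area, hyperbolic_perimeter, rho))
```

On the hyperbolic plane the numerator $a - l^2/(4\pi)$ is negative for every radius, in agreement with the isoperimetric inequality for $K \le 0$, while on the sphere it is positive.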

2.8. Let us now consider the sign of equality in the isoperimetric inequality. 
In order to illustrate a very trivial point, let us consider a Jordan 
region in the plane, bounded by a rectifiable curve which is not a circle. Then 
we have $a<l^2/(4\pi)$. Putting some hills on this plane region, we can keep the 
perimeter $l$ fixed and increase the area until we have $a=l^2/(4\pi)$. Since our 
hills were otherwise quite arbitrary, it is clear that from $a=l^2/(4\pi)$ alone we 
cannot conclude anything concerning the surface. On the other hand, if we 
restrict ourselves to analytic surfaces with $K \le 0$, and if we use the fact that 
$K \equiv 0$ on such a surface as soon as $K \equiv 0$ on any subregion, then the lemma of 
§2.2 yields immediately the following result. 

If an analytic surface with $K \le 0$ contains some portion for which the sign of 
equality holds in the isoperimetric inequality $a \le l^2/(4\pi)$, then $K \equiv 0$ on the 
surface, and $a=l^2/(4\pi)$ holds only for the geodesic circles. 

2.9. In what precedes, we extended a theorem, previously proved only for 
minimal surfaces, to surfaces with $K \le 0$. A systematic study of similar generalizations 
might lead to interesting results. We mention here a few immediate 
facts. 

Let $S$ be a piece of surface with $K \le 0$, which admits an isothermic representation 

(17) $S: x = x(u, v),\ y = y(u, v),\ z = z(u, v),\quad u^2+v^2 \le \rho^2$, 

with the properties described in §2.1. Put again $E=G=\lambda(u, v)$, and suppose 


† See for instance L. P. Eisenhart, Differential Geometry, Ginn and Company, 1909, p. 209. 




that $\lambda(0, 0)=1$ (that is to say, that the linear magnification is unity at the 
origin). Denote by $l(r)$ the length of the image of $u^2+v^2=r^2$, and by $a(r)$ the 
area of the image of $u^2+v^2<r^2$. Then 

(a) $l(r)$ is an increasing function of $r$†; 

(b) $l(r) \ge 2\pi r$;‡ 

(c) $a(r) \ge \pi r^2$.‡ 

We have 

$l(r) = \int_0^{2\pi} \lambda(r\cos\phi, r\sin\phi)^{1/2}\, r\, d\phi$. 

Since $K \le 0$, it follows§ that $\lambda(u, v)^{1/2}$ is of class PL. Also (see §1.1), $r = 
|u+iv|$ is of class PL. Therefore‖ $r\lambda^{1/2}$ is of class PL, and consequently¶ $r\lambda^{1/2}$ 
is subharmonic. (a) follows then from the above expression for $l(r)$ and from 
the fact that the integral mean of a subharmonic function is an increasing 
function of $r$.†† 

To prove (b) and (c), observe that 

(18) $a(r) = \pi r^2 A(\lambda; 0, 0; r), \quad l(r) = 2\pi r\, L(\lambda^{1/2}; 0, 0; r)$. 

On account of $K \le 0$, $\lambda$ and consequently $\lambda^{1/2}$ are subharmonic. Hence‡‡ 

(19) $1 = \lambda(0, 0) \le A(\lambda; 0, 0; r), \quad 1 = \lambda(0, 0)^{1/2} \le L(\lambda^{1/2}; 0, 0; r)$. 

Thus (b) and (c) follow from (18) and (19). 


† For the plane case, see L. Bieberbach, Über die konforme Kreisabbildung nahezu kreisförmiger 
Bereiche, Berlin Sitzungsberichte, 1924, pp. 181-188; for minimal surfaces, see T. Radó, Some remarks 
on the problem of Plateau, Proceedings of the National Academy of Sciences, vol. 16 (1930), 
pp. 242-248, and On Plateau’s problem, Annals of Mathematics, vol. 31 (1930), pp. 457-469. 

‡ For the plane case, see L. Bieberbach, Palermo Rendiconti, vol. 38 (1914), pp. 98-112; for 
minimal surfaces, see E. F. Beckenbach, The area and boundary of minimal surfaces, Annals of 
Mathematics, vol. 33 (1932), pp. 658-664. 

§ See the authors’ paper in these Transactions, vol. 35, pp. 648-661, §1.8. 

‖ The product of two functions of class PL is again a function of class PL; see the authors’ 
paper in these Transactions, vol. 35, pp. 648-661, §1.8. 

¶ Every positive power of a function of class PL is a subharmonic function; see the authors’ 
paper in these Transactions, vol. 35, pp. 648-661, §§1.7 and 1.8. 

†† See F. Riesz, Acta Mathematica, loc. cit., first part, p. 338. 

‡‡ These inequalities express two of several clearly equivalent definitions of subharmonic functions. 
See J. E. Littlewood, On the definition of a subharmonic function, London Mathematical Society 
Journal, vol. 2 (1927), pp. 189-192. 




COROLLARY. If the sign of equality holds in (b) or (c) for any value of $r$, 
$0<r<\rho$, then $\lambda(u, v) \equiv 1$, so that (see §2.2) the map (17) is isometric. In other 
words, $S$ is a developable piece of surface and is a geodesic circle given in isometric 
representation. 

To see this, consider for instance the sign of equality in (c); then 

$1 = \dfrac{1}{\pi r^2}\iint \lambda(u, v)\,du\,dv$. 

Consequently $\lambda(0, 0) = h(0, 0)$, where $h(u, v)$ is the harmonic function in 
$u^2+v^2<r^2$ coinciding with $\lambda(u, v)$ on $u^2+v^2=r^2$, and therefore† $\lambda(u, v) 
\equiv h(u, v)$. It follows that 

$\Delta\lambda(u, v) = 0$. 

But $\lambda(u, v)$ is also of class PL, so that (see §0.2) 

$\lambda\Delta\lambda - (\lambda_u^2 + \lambda_v^2) \ge 0$. 

Consequently $\lambda_u^2+\lambda_v^2 \le 0$ and therefore $\lambda(u, v)$ is constant. But $\lambda(0, 0)=1$, 
so that $\lambda(u, v) \equiv 1$. The same argument holds for the sign of equality in (b), 
with $\lambda(u, v)^{1/2}$ in place of $\lambda(u, v)$. 


† $\lambda(u, v)$ is subharmonic, and therefore, by the definition of subharmonic functions (see F. Riesz, 
Acta Mathematica, loc. cit., first part, p. 331) $\psi(u, v) = \lambda(u, v) - h(u, v)$ is subharmonic. We have 
$\psi(r\cos\phi, r\sin\phi)=0$, $\psi(0, 0)=0$. But a subharmonic function cannot attain an interior maximum 
unless it is identically constant (see above reference to F. Riesz, p. 331). Therefore $\psi(u, v) \equiv 0$, 
$\lambda(u, v) \equiv h(u, v)$. 


OHIO STATE UNIVERSITY, 
COLUMBUS, OHIO 


A TRANSFORMATION OF THE PROBLEM 
OF LAGRANGE IN THE CALCULUS 
OF VARIATIONS* 


BY 
LAWRENCE M. GRAVES 


By means of a simple transformation suggested by Bliss, the problem of 
Lagrange may be reduced to one in which the side conditions are integral 
equations rather than differential equations, and no derivatives enter ex- 
plicitly. A multiplier rule for the transformed problem is derived below, in 
which the multipliers are all constants. When the inverse transformation is 
applied to this multiplier rule, formulas are obtained for the non-constant 
multipliers occurring in the ordinary form of the Lagrange multiplier rule, 
and it is seen that the constant multipliers obtained here may be identified 
with certain constants appearing in the ordinary form of the rule. 

In connection with his applications of the calculus of variations to problems 
in economics, Roos† has been led to consider a generalization of the 
problem of Lagrange in which integro-differential equations occur among 
the side conditions. The transformation and analysis given below apply 
with equal facility to Roos’ problem. 

For normal arcs an analogue of the Weierstrass condition is derived 
for the transformed problem. It is not necessary to assume that the mini- 
mizing arc is normal on sub-arcs. For such problems as that of Roos, no 
generalization of the Jacobi-Mayer condition has to my knowledge yet been 
obtained, though several attempts have been made. 

1. The transformation of the problem. We shall start with the problem 
suggested by Roos, in the following form: To find necessary conditions on 
a curve 


yi = 5% x), 


which minimizes an integral 


I= y, 


in a certain class of curves satisfying the integro-differential equations 


* Presented to the Society, December 31, 1930; received by the editors November 21, 1932. 
† Generalized Lagrange problems in the calculus of variations, these Transactions, vol. 30 (1928), 
pp. 360-384. 




L. M. GRAVES [July 
y(x), y’(x), u(x) | =0 1, < n), 


and the end conditions y;(x:) =ya, yi(%2) =yiz. The functions f, ¢., and P, 
are supposed to have continuous first partial derivatives with respect to all 
their arguments in a certain region $R$ of $(2n+q+2)$-dimensional space. The 
curves admitted to consideration are supposed to be of class $D'$, i.e., the 
functions $y_i(x)$ are continuous and their derivatives $y_i'(x)$ have at most a 
finite number of ordinary finite discontinuities. Admissible curves are also 
supposed to have all their elements 

$[x, s, y(s), y'(s), u(s)] \quad (x_1 \le s \le x \le x_2)$ 

interior to the region $R$. We shall suppose also that along the minimizing 
curve the matrix of partial derivatives $\phi_{\alpha y_i'}$ ($\alpha=1, \dots, m$; $i=1, \dots, n$) 
has rank $m$. For simplicity we suppose that the minimizing curve itself is of 
class $C'$. 

Then as Bliss* has shown, additional functions $\phi_r(x, y')$ ($r=m+1, \dots, n$) 
may be adjoined, with the same continuity properties as the 
original functions $\phi_\alpha$, so that the functional determinant $|\phi_{iy_j'}|$ does not 
vanish along the minimizing curve. Hence the equations $\phi_i(x, y, y', u) = z_i$ 
have a unique continuous solution 

(1) $y_i' = \psi_i(x, y, u, z)$ 

with $(x, y, y', u, z)$ near the values along the minimizing curve, and the 
functions $\psi_i$ have continuous first partial derivatives. If equations (1) are 
used to eliminate the $y_i'$, the integral $I$ becomes 


1 


and the side conditions become 


= + f “vals, 9(s), u(s), 2(3)]ds (i 


= 0 +, 
The end conditions are $y_i(x_2) = y_{i2}$ ($i=1, \dots, n$). 


* The problem of Mayer with variable end points, these Transactions, vol. 19 (1918), p. 312. 


1933] THE PROBLEM OF LAGRANGE 677 


2. The multiplier rule, and an analogue of the Weierstrass condition, for 
the transformed problem. We shall now consider the new form of the problem 
on its own merits, and in order to simplify the notation in this section, we 
reformulate it as follows: To find necessary conditions on a “curve” 


(2) = Zp = 2,(x) (t=1,---,m;r=1,---,»), 


which minimizes an integral 

$I = \int_{x_1}^{x_2} g[s, y(s), z(s)]\,ds$ 

in a certain class of curves satisfying the integral equations 

(3) $y_i(x) = y_{i1} + \int_{x_1}^{x} \psi_i[x, s, y(s), z(s)]\,ds \quad (i = 1, \dots, n)$ 

and the end conditions 

(4) $y_i(x_2) = y_{i2} \quad (i = 1, \dots, p \le n)$. 

The functions $g(s, y, z)$ and $\psi_i(x, s, y, z)$ are supposed to be continuous 
and to have continuous partial derivatives with respect to their arguments 
$y_i$ and $z_r$ in a certain region $R$ of $(x, s, y, z)$ space. The curves (2) admitted 
to consideration are supposed to have all their elements $(x, s, y(s), z(s))$ 
interior to $R$, and the functions $y_i(x)$ are supposed to be continuous, while 
the functions $z_r(x)$ have at most a finite number of ordinary finite discontinuities. 

Under these circumstances, the equations (3) have a unique solution 
$y_i(x) = Y_i[z|x]$ for each set of functions $z_r(x)$ near those associated with the 
minimizing curve, and the functionals $Y_i$ have differentials* $\eta_i(x) = dY_i[z; 
\zeta|x]$ which satisfy the equations of variation 

(5) $\eta_i(x) = \int_{x_1}^{x} \psi_{iy_j}(x, s)\eta_j(s)\,ds + \int_{x_1}^{x} \psi_{iz_r}(x, s)\zeta_r(s)\,ds$. 

Here and elsewhere we abbreviate such expressions as $\psi_{iy_j}[x, s, y(s), z(s)]$ to 
$\psi_{iy_j}(x, s)$. When the functionals $Y_i[z]$ are substituted in the integral $I$, it 
becomes a functional $J[z]$, which is to be minimized in the class of functions 
$z_r(x)$ for which $Y_i[z|x_2] = y_{i2}$ ($i=1, \dots, p$). The functional $J[z]$ also has a 
differential given by 


* See, e.g., Graves, Implicit functions and differential equations in general analysis, these Trans- 
actions, vol. 29 (1927), pp. 514-552. 




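$dJ[z; \zeta] = \int_{x_1}^{x_2} \{g_{y_i}(s)\eta_i(s) + g_{z_r}(s)\zeta_r(s)\}\,ds$, 

where the $\eta_i$ are determined by equations (5). If $J[z]$ is a minimum, the usual 
argument shows that there exist constants $l_0, c_1, \dots, c_p$, not all zero, such 
that 

(6) $l_0\,dJ[z; \zeta] + c_i\,dY_i[z; \zeta|x_2] = 0$ 

for all functions $\zeta_r$ having only a finite number of finite discontinuities. If we 
set 

(7) $G(s, y, z) = l_0\,g(s, y, z) + \sum_{i=1}^{p} c_i\psi_i(x_2, s, y, z)$, 

equation (6) becomes 

(8) $\int_{x_1}^{x_2} \{G_{y_i}(s)\eta_i(s) + G_{z_r}(s)\zeta_r(s)\}\,ds = 0$. 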


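Let $S_{ij}(x, s)$ denote the reciprocal kernel matrix for the Volterra system (5), 
so that its solution is given by 

(9) $\eta_i(x) = \int_{x_1}^{x} \psi_{iz_r}(x, s)\zeta_r(s)\,ds - \int_{x_1}^{x}\int_{x_1}^{t} S_{ij}(x, t)\psi_{jz_r}(t, s)\zeta_r(s)\,ds\,dt$. 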
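
The role of the reciprocal kernel can be seen in a scalar toy case. The sketch below is only an illustration under stated assumptions: it takes the single Volterra equation $\eta(x) = h(x) + \int_a^x K(x, s)\eta(s)\,ds$ with $K \equiv 1$, and it assumes the sign convention in which the solution is written $\eta(x) = h(x) - \int_a^x S(x, s)h(s)\,ds$ (the convention suggested by the later formulas of this section), so that here $S(x, s) = -e^{x-s}$. The function names and the trapezoidal discretization are hypothetical choices, not taken from the paper.

```python
import math


def solve_volterra(kernel, h, a, b, n=400):
    """Trapezoidal marching scheme for eta(x) = h(x) + int_a^x kernel(x, s) eta(s) ds."""
    dx = (b - a) / n
    xs = [a + i * dx for i in range(n + 1)]
    eta = [h(xs[0])]
    for i in range(1, n + 1):
        x = xs[i]
        acc = 0.5 * kernel(x, xs[0]) * eta[0]
        acc += sum(kernel(x, xs[j]) * eta[j] for j in range(1, i))
        # The unknown eta(x) enters the last trapezoid term; solve for it.
        eta.append((h(x) + dx * acc) / (1.0 - 0.5 * dx * kernel(x, x)))
    return xs, eta


def apply_reciprocal(recip, h, a, x, n=400):
    """Evaluate h(x) - int_a^x recip(x, s) h(s) ds with the trapezoidal rule."""
    if x == a:
        return h(a)
    dx = (x - a) / n
    total = 0.5 * (recip(x, a) * h(a) + recip(x, x) * h(x))
    total += sum(recip(x, a + j * dx) * h(a + j * dx) for j in range(1, n))
    return h(x) - dx * total


def unit_kernel(x, s):
    return 1.0


def unit_reciprocal(x, s):
    # Reciprocal kernel of K = 1 under the sign convention assumed above.
    return -math.exp(x - s)


def forcing(x):
    return x


if __name__ == "__main__":
    _, eta = solve_volterra(unit_kernel, forcing, 0.0, 1.0)
    print("marching solution at x = 1:  ", eta[-1])
    print("reciprocal-kernel formula:   ", apply_reciprocal(unit_reciprocal, forcing, 0.0, 1.0))
    print("closed form e - 1:           ", math.e - 1.0)
```

With $h(x) = x$ the equation is equivalent to $\eta' = 1 + \eta$, $\eta(0) = 0$, so both computations should approximate $e^x - 1$.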


By substituting (9) in (8) and making certain interchanges in the order of 
integration, we find 


f = 0 


1 


for all ¢,(x), where 


d(x) = G.,(2) + f 


Hence we have proved the 


ANALOGUE OF THE LAGRANGE MULTIPLIER RULE. If the functions $y_i(x)$, 
$z_r(x)$ minimize the integral $I$ in the class of all such functions satisfying the 
integral equations (3) and the end conditions (4), then there exist constants $l_0$, 
$c_1, \dots, c_p$, not all zero, such that 

(11) $G_{z_r}(x) + \int_{x}^{x_2} G_{y_i}(t)\psi_{iz_r}(t, x)\,dt - \int_{x}^{x_2}\int_{x}^{u} G_{y_i}(u)S_{ik}(u, t)\psi_{kz_r}(t, x)\,dt\,du = 0$, 

where $G(s, y, z)$ is defined by equation (7), and $S_{ik}(x, s)$ is the reciprocal kernel 
matrix for the system (5). 

We shall say that a curve $y_i=y_i(x)$, $z_r=z_r(x)$ ($x_1 \le x \le x_2$) is normal in 
case there exist sets of variations $\eta_{i\sigma}(x)$, $\zeta_{r\sigma}(x)$ ($\sigma=1, \dots, p$), satisfying 
the equations of variation (5) and such that the determinant $|\eta_{i\sigma}(x_2)|$ ($i$, 
$\sigma=1, \dots, p$) does not vanish. The usual considerations show that an arc 
is normal if and only if it has no set of multipliers $l_0, c_1, \dots, c_p$, with $l_0=0$, 
with which it satisfies the equations (11). For a normal minimizing arc we 
may always assume $l_0=1$, and then the remaining multipliers are uniquely 
determined. 

ANALOGUE OF THE WEIERSTRASS CONDITION. If the minimizing curve 
for our problem is normal, and if $l_0$ is taken equal to unity, then for every element 
$(x, y, z)$ of the minimizing curve and for arbitrary numbers $Z_r$, the expression 


G(x, Z) G(x, z) 


+ f f bas | =, 9,2) 7,3) 


cannot be negative. 

This theorem may be proved by the method of the author’s paper The 
Weierstrass condition for the problem of Bolza in the calculus of variations†, 
as follows. Let $\eta_{i\sigma}(x)$, $\zeta_{r\sigma}(x)$ ($\sigma=1, \dots, p$) be an admissible set of functions 
satisfying the equations of variation (5), and such that the determinant $|\eta_{i\sigma}(x_2)| 
\ne 0$, where $i, \sigma=1, \dots, p$. Let $x_1<x_3<x_2$, and 


2,*(x, B, €) = 2-(x) + on SX 
= Z, on %3< % 


† Annals of Mathematics, vol. 33 (1932), pp. 747-752. 

When the functions $z^*$ are substituted for $z$ in equations (3) these equations 
determine functions $y_i=y_i^*(x, \beta, \epsilon)$ defined for $x_1 \le x \le x_2$, $(\beta, \epsilon)$ near $(0, 0)$, 
which are continuous and have partial derivatives with respect to $\beta$ and $\epsilon_\sigma$, 
which are continuous except that the partial derivatives $y_{i\beta}^*$ may be discontinuous 
in $x$ at $x=x_3+\beta$. Set $I(y^*, z^*) = I(\beta, \epsilon)$. We are supposing that $I(\beta, \epsilon)$ 
has a minimum for $\beta=\epsilon_\sigma=0$. Then if the equations 


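(12) $I(\beta, \epsilon) = I(0, 0) + v, \quad y_i^*(x_2, \beta, \epsilon) = y_{i2} \quad (i = 1, \dots, p)$ 

have a solution $\beta(v)$, $\epsilon_\sigma(v)$ near $v=0$, we must have $\beta'(0) \ge 0$. By differentiating 
equations (12) with respect to $v$, we find for $\beta=\epsilon_\sigma=0$, 

$I_\beta\beta' + I_{\epsilon_\sigma}\epsilon_\sigma' = 1, \quad \theta_i(x_2)\beta' + \eta_{i\sigma}(x_2)\epsilon_\sigma' = 0 \quad (i = 1, \dots, p)$, 

where $\theta_i(x) = y_{i\beta}^*(x, 0, 0)$. Multiply the last $p$ equations by the constants 
$c_1, \dots, c_p$ respectively, and add to the first. By equation (6) the result is 

$[I_\beta + c_i\theta_i(x_2)]\beta' = 1$. 

Hence 

$E \equiv I_\beta + \sum_i c_i\theta_i(x_2) \ge 0$. 

Now the functions $\theta_i(x)$ satisfy the equations 

$\theta_i(x) = 0 \quad (x_1 \le x < x_3)$, 

$\theta_i(x) = A_i(x) + \int_{x_3}^{x} \psi_{iy_j}(x, s)\theta_j(s)\,ds \quad (x_3 < x \le x_2)$, 

where $A_i(x) = \psi_i(x, x_3, y_3, Z) - \psi_i(x, x_3, y_3, z_3)$, $y_3=y(x_3)$, $z_3=z(x_3)$. Hence by 
use of the reciprocal kernel $S_{ij}$ of $\psi_{iy_j}$, 

$\theta_i(x) = A_i(x) - \int_{x_3}^{x} S_{ij}(x, s)A_j(s)\,ds$. 

By direct calculation 

$I_\beta = g(x_3, y_3, Z) - g(x_3, y_3, z_3) + \int_{x_3}^{x_2} g_{y_i}(x)\theta_i(x)\,dx$. 

Combining these results we find 

$E = G(x_3, y_3, Z) - G(x_3, y_3, z_3) + \int_{x_3}^{x_2} G_{y_i}(x)\left[A_i(x) - \int_{x_3}^{x} S_{ij}(x, s)A_j(s)\,ds\right]dx$, 

which reduces by an interchange of order of integration to the expression 
given in the theorem. 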




3. Application of the inverse transformation to the new multiplier rule. 
Returning now to the problem of §1, we shall for simplicity consider only the 
case when the functions $\phi_i$ are independent of the $u$’s, that is to say, the ordinary 
Lagrange problem with fixed end points. Then the functions $\psi_i$ of 
§2 are independent of $x$, and $p=n$. We shall understand that the indices used 
here have the following ranges: $i, j, k, l=1, \dots, n$; $\alpha=1, \dots, m$; $r= 
m+1, \dots, n$. From the definitions of the functions $S_{ij}$ and $G$, we obtain 
the following relations: 


(13) = Sri, 
(14) Vie Pivs = — 
(15) vinta) — f = — Si(0, 2), 
(16) = (lof; + 
(17) Gy, = + + 
The analogue of the Euler-Lagrange equations may be written 
+ f [ent f Salt | dt = d(2), 
(18) A(x) = 0. 


If we multiply equations (18) by ¢i,, and add, use equations (13) and (16), 
and interchange the order of integration in the double integral, we find 


(19) lofy + f G,,(i)dt — f f = 
z z t 


Also if we multiply equations (18) by ¢:,, and add, we find with the help of 
equations (14), (15) and (16), 


(20) Gy jr(v, x)dv = (lofy + Pin + 
Combining equations (20) and (17) with (19) we find 


lof + +f (lof: AaGay,)dt AaGay’ 


which may be written in the familiar form 




(21) Fy -f Fy dx & 
72 


by setting F =lof —Aada. 
If we apply the inverse transformation in the more general problem con- 
sidered in §1, we find in place of equations (21), 


Fy,(x) = fF s)dt 


+ Fu,(s)Pry,(s, ds — ¢. 


UNIVERSITY OF CHICAGO, 
CHICAGO, ILL. 


CONTRIBUTIONS TO THE THEORY OF TRANSFORMATIONS 
OF NETS IN A SPACE $S_n$* 


BY 
V. G. GROVE 


1. INTRODUCTION 


Let there be given a surface $S$ in euclidean space of $n \ge 3$ dimensions. 
Suppose that through each point $x$ of $S$ there passes a line $g$ of a congruence 
$G$. The developables of $G$ intersect $S$ in a net of curves $N$. We have called 
such a net a $C$ net.† 

Let $\bar S$ be another surface in the same space $S_n$, in one-to-one point correspondence 
with $S$, corresponding points lying on the lines $g$ of $G$. The 
developables of $G$ intersect $\bar S$ in a $C$ net of curves $\bar N$. The two nets $N$ and $\bar N$ 
are said to be in relation‡ $C$. 

The tangent planes to $S$ and $\bar S$ intersect in a line $h$. If the points of $h$ are 
each equidistant from the corresponding points of $S$ and $\bar S$, we shall say that 
the nets $N$ and $\bar N$ are in relation $E$. 

We propose in this paper to develop a theory of the relations defined above 
which is independent of the dimension of the space $S_n$ for $n \ge 3$. 

Let the coordinates of the point $x$ on $S$ be $x_1, x_2, \dots, x_n$, the coordinates 
of the corresponding point $\bar x$ on $\bar S$ be $\bar x_1, \bar x_2, \dots, \bar x_n$, and the direction cosines 
of the line $g$ joining them be $\lambda_1, \lambda_2, \dots, \lambda_n$, where 

$\sum_{i=1}^{n} \lambda_i^2 = 1$. 

Let the parametric curves on $S$ and $\bar S$ be chosen as the curves of the given 
nets $N$, $\bar N$ on these surfaces. The pairs of functions $(x, \bar x)$ and the number pair 
$(1, 1)$ are solutions of a system of differential equations of the form§ 

(1) $\bar x_u = mx_u - Ax + A\bar x, \quad \bar x_v = nx_v - Bx + B\bar x$. 

* Presented to the Society, April 14, 1933; received by the editors January 15, 1933. 

† V. G. Grove, The transformation C of nets in hyperspace, these Transactions, vol. 33 (1931), 
pp. 733-741. Hereafter referred to as C. 

‡ C, p. 733. 

§ C, p. 734. 




684 V. G. GROVE 


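The coordinates of the point $\bar x$ are of the form 

(2) $\bar x = x + \delta\lambda$. 

The pairs of functions $(x, \lambda)$ are solutions of the following system of differential 
equations: 

(3) $\lambda_u = \mu x_u + \alpha\lambda, \quad \lambda_v = \nu x_v + \beta\lambda$, 

wherein 

$\mu = (m-1)/\delta, \quad \alpha = -\mu E^{1/2}\cos\theta^{(u)}, \quad E = \sum_{i=1}^{n} x_{iu}^2$, 

$\nu = (n-1)/\delta, \quad \beta = -\nu G^{1/2}\cos\theta^{(v)}, \quad G = \sum_{i=1}^{n} x_{iv}^2$, 

and $\theta^{(u)}$ and $\theta^{(v)}$ are the angles between $g$ and the tangents to $v=$ const. and 
$u=$ const. respectively. 
From (2) we find that 

(4) $\bar E^{1/2}\cos\bar\theta^{(u)} = E^{1/2}\cos\theta^{(u)} + \delta_u, \quad \bar G^{1/2}\cos\bar\theta^{(v)} = G^{1/2}\cos\theta^{(v)} + \delta_v$, 

wherein $\bar E$, $\bar G$ etc. bear the same relation to $\bar N$ as the corresponding quantities 
bear to $N$. 
The focal points $\xi$ and $\eta$ of the line $g$ have the coordinates 

(5) $\xi = x - \lambda/\mu, \quad \eta = x - \lambda/\nu$. 

If $\mu\nu=0$ one or both of the families of developables of $G$ are cylinders. 
The tangent planes to $S$ and $\bar S$ at $x$ and $\bar x$ intersect in the line determined 
by the two points 

(6) $r = x - mx_u/A, \quad s = x - nx_v/B, \quad AB \ne 0$. 

If $AB=0$, one or both of the curves of the net $\bar N$ are parallel to the corresponding 
curves of $N$. The points $r$ and $s$ are equidistant from $x$ and $\bar x$ if 

(7) $A\delta + 2mE^{1/2}\cos\theta^{(u)} = 0, \quad B\delta + 2nG^{1/2}\cos\theta^{(v)} = 0$. 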


We may readily verify that equations (7) are necessary and sufficient conditions 
that the nets $N$ and $\bar N$ be in relation $E$ if not both $A$ and $B$ are zero. 
Equations (7) are of the form 


wherein $P$, $Q$, $P'$, $Q'$ are independent of $\delta$. 


1933] TRANSFORMATIONS OF NETS IN S, 685 


If we differentiate the first of equations (1) with respect to $v$ and the 
second with respect to $u$, we find that if $m-n \ne 0$, 

(8) $x_{uv} = ax_u + bx_v - Mx + M\bar x$, 

wherein $a$, $b$, $M$ are defined by 

(9) $(m-n)a = B(m-1) - m_v, \quad (m-n)M = B_u - A_v, \quad (n-m)b = A(n-1) - n_u$. 

If $m-n=0$, we find that 

(10) $B(m-1) - m_v = 0, \quad A(n-1) - n_u = 0$. 

In case that $N$ is not conjugate, and $C$ is not radial, we find from (8) that 

(11) $\bar x - x = (x_{uv} - ax_u - bx_v)/M, \quad \lambda = (x_{uv} - ax_u - bx_v)/(\delta M)$. 

If the net $N$ is conjugate, or if $C$ is radial, the congruence $G$ is not determined 
by the net $N$ alone. 


2. CONGRUENCES SEMI-NORMAL TO A NET 


A congruence $G$ will be said to be semi-normal to the net corresponding to 
the developables of the congruence if the lines $g$ of $G$ are perpendicular to the 
tangents of one (only) of the families of curves of the net. In particular suppose 
that the line $g$ is perpendicular to the tangent at $x$ to the curve $v=$ 
const. Suppose that the transformation $C$ is a transformation $E$. It follows 
from (2) and (3) that 

$A = \delta_u = 0$. 

Hence if a congruence is semi-normal to the net $N$ in which the developables of 
the congruence intersect the sustaining surface $S$ of $N$, the congruence is semi-normal 
to any $E$ transform of $N$. Moreover the distance between corresponding 
points $x$ and $\bar x$ on the curves of $N$ and $\bar N$ to whose tangents the lines $g$ are normal, 
is a constant, and the tangent lines to these curves are parallel. 


3. TWO-PARAMETER FAMILIES OF LINES NORMAL TO A SURFACE 


Let $\Gamma$ be a two-parameter family of lines, such that through each point 
$x$ of $S$ there passes one and only one line $l$ of $\Gamma$. Suppose furthermore that 
this line $l$ is perpendicular to the tangent plane to $S$ at $x$ for all points $x$ on $S$. 
We shall say that $\Gamma$ is normal to $S$. Let the direction cosines of $l$ be $l_1, l_2, \dots, 
l_n$. It follows therefore that 

$\sum_{i=1}^{n} l_i x_{iu} = 0, \quad \sum_{i=1}^{n} l_i x_{iv} = 0$. 




Consider a curve C on S with parametric equations 
u= u(t), v = v(t). 


Any point y on the tangent ¢ to C at x has coordinates defined by an expres- 


sion of the form 
du 


As x moves along C the point y describes a curve, the direction cosines of 
whose tangent are proportional to expressions of the form 


P( + + + L(xu, Xe), 


wherein $L(x_u, x_v)$ is a homogeneous linear function of the indicated arguments. 
The line $l$ is perpendicular to the tangent to the locus of the point $y$ if and only 
if $C$ is an integral curve of the differential equation 

(12) $D\,du^2 + 2D'\,du\,dv + D''\,dv^2 = 0$, 

wherein 

(13) $D = \sum l x_{uu}, \quad D' = \sum l x_{uv}, \quad D'' = \sum l x_{vv}$. 

We shall call the net defined by (12), in case a net is so defined, the A net of $\Gamma$. 
We readily verify that the line $l$ is normal to the osculating plane at $x$ of any 
curve of the A net of $\Gamma$. The A net of $\Gamma$ is indeterminate in case $\Gamma$ is normal to 
every plane of the two-osculating space $S(2, 0)$ of $S$ at $x$. If the parametric net is 
a conjugate net it follows that $D'=0$ for the A net of every two-parameter 
family of lines $\Gamma$ normal to $S$. 

Suppose now that $\Gamma$ is a congruence $G$. Let the parametric curves be the 
curves in which the developables of $G$ intersect $S$. Equations (3) may be 
written 

(14) $\lambda_u = \mu x_u, \quad \lambda_v = \nu x_v$. 

It follows therefore, that, if $C$ is not radial, the functions $x$ and $\lambda$ each satisfy 
differential equations of the Laplace type. Moreover 

F= 2,2, = 0, 


F= = 0. 


Hence if a congruence is normal to a surface its developables intersect the surface 
in an orthogonal conjugate net. Moreover the net of curves of the spherical 
indicatrix of G corresponding to the developables of G is an orthogonal conjugate 
net. 




If the parametric curves are not the curves in which the developables of 
G intersect S, the curves in which these developables do intersect S are de- 
fined by the differential equation 


(15) (ED’ — FD)du? + (ED” — GD)dudv + (FD” — GD’)dv? = 0. 


We remark at this point that a given surface $S$ cannot possess a normal 
congruence unless it sustains an orthogonal conjugate net. Moreover it cannot 
possess more than one such normal congruence unless the developables of such other 
congruence intersect the surface in the same net (15). The tangents to the curves 
of the A nets of such other congruences belong to the same involution, namely 
that determined by the tangents to the minimal curves and the tangents to 
the curves of the A net of the given normal congruence. 


4. THE RADIAL TRANSFORMATION E 
Suppose that the transformation C is radial. It follows that 
m—-n=0. 
Suppose also that C is an £ transformation. From (7) and (10), we find that 
(16) 52 = k°(m — 1)2/m, 


wherein k is an arbitrary constant different from zero. Conversely if m - n = 0, 
and equation (16) is satisfied, so also are equations (7). If r and r̄ denote 
the distances from the points x and x̄ respectively to the focal point of g, 
we find readily that 


r? = 5°m/(m — 1)? = 
Hence if two nets are radial transforms in relation E, they are transforms of one 
another by a transformation by reciprocal radii, and conversely. 


Equations (1) for a transformation by reciprocal radii assume the follow- 
ing simple form: 


= k*p?x, — (x — log — 1), 
Ou 


(17) 


0 
= (x — log (ku? — 1). 
v 


Equations (17) may readily be integrated. The solution for a proper choice of 
the constants of integration may be written in the form 


(18) = 


688 V. G. GROVE 


The lines joining x to x̄ evidently pass through the origin. Moreover from (5) 
the quantity —1/y is the distance from the point x to the fixed focal point of 
g. Hence 


These of course are the familiar formulas of a transformation by reciprocal 
radii. 
MICHIGAN STATE COLLEGE, 
EAST LANSING, MICH. 


ON THE EQUATION P(A, X) =0 
IN MATRICES* 


BY 
WILLIAM E. ROTH 


In the present discussion we shall consider the solution of the equation 

(1) $P(A, X) \equiv \sum_{k=0}^{p} F_k(A)\, X^{p-k} = 0$, 

where $A$ is a known $n \times n$ matrix, $F_k(\lambda)$ ($k = 0, 1, \dots, p$) are polynomials† 
in the scalar variable $\lambda$, and $X$ is the unknown $n \times n$ matrix. The equation is 
a special case under that of an earlier paper by the author in which the 
coefficient matrices are not polynomials in a given matrix, but are known $n \times n$ 
matrices.‡ With the restrictions upon the coefficients which we now impose, 
it is possible to establish inequalities limiting the degree and the number of 
the elementary divisors of $X - \mu I$, where $X$ is a solution of (1). These 
inequalities depend upon a knowledge of the elementary divisors of $P(A, \mu)$ 
and of $A - \lambda I$, where $\mu$ and $\lambda$ are scalar variables. Certain theorems below, 
particularly Theorems III and IV with appropriate changes, are valid for the 
more general equation of the type studied by the author and others.§ 

Solutions of (1) are taken up under the following hypotheses: (a) that $X$ 
be a unilateral solution on the right (or left) of the polynomials $F_k(A)$ 
($k = 0, 1, \dots, p-1$); (b) that $X$ be a bilateral solution; and (c) that $X$ be 
commutative with $A$. By means of the idea of transversion of matrices as defined in 
§III, we show the fundamental relationship which exists between solutions 
on the right and those on the left of (1), and between these and the bilateral 
solutions if such exist. 


* Presented to the Society, November 30, 1928, and September 9, 1931; received by the 
editors October 7, 1932. 

† Considerations which follow do not require the functions $F_k(\lambda)$ ($k = 0, 1, \dots, p$) to be 
polynomials. In fact any functions which, together with at most their first $n-1$ derivatives, may be 
expanded into series of non-negative powers of $\lambda$ are permissible, provided that the characteristic values 
of $A$ lie within or on the circles of convergence of each of the series representing $F_k(\lambda)$ ($k = 0, 1, \dots, p$) 
and their first $n-1$ derivatives. For information on such functions of matrices the reader may 
consult Hensel, Über Potenzenreihen von Matrizen, Journal für die reine und angewandte Mathematik, 
vol. 155, pp. 107-110; Sheffer, A note on matrix power series, American Mathematical Monthly, vol. 
36 (1930), pp. 228-231. 

‡ Roth, On the unilateral equation in matrices, these Transactions, vol. 32 (1930), pp. 61-80. 
This paper cites several articles on algebraic equations in matrices. 

§ Roth, loc. cit. 


689 


W. E. ROTH [July 


I. PRELIMINARY NOTIONS AND LEMMAS 


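DEFINITION. If $A(\lambda) = (a_{ij}(\lambda))$ ($i = 1, 2, \dots, r$; $j = 1, 2, \dots, s$), where 
$a_{ij}(\lambda)$ are polynomials in $\lambda$, if 

$a_{ij}(\lambda) \equiv r_{ij}(\lambda)$, mod $(\lambda - a)^n$ ($i = 1, 2, \dots, r$; $j = 1, 2, \dots, s$), 

and if $R(\lambda) = (r_{ij}(\lambda))$, then 

$A(\lambda) \equiv R(\lambda)$, mod $(\lambda - a)^n$. 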


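DEFINITION. If $A(\lambda)$ is an $m \times m$ $\lambda$-matrix whose elements are polynomials 
in $\lambda$, if 

$A(\lambda) \equiv R(\lambda)$, mod $(\lambda - a)^n$, 

and if the $i$th elementary divisor of $R(\lambda)$ corresponding to the linear factor $\lambda - a$ 
is $(\lambda - a)^{\alpha^{(i)}}$ ($i = 1, 2, \dots, \rho$), where $\rho$ is the rank of $R(\lambda)$, then $(\lambda - a)^{\alpha^{(i)}}$ 
($i = 1, 2, \dots, \rho$) is the $i$th elementary divisor of $A(\lambda)$ and $\rho$ its rank with respect 
to the modulus $(\lambda - a)^n$. 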


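DEFINITION. If $A(\lambda)$ is an $m \times m$ $\lambda$-matrix whose elements are polynomials 
in $\lambda$, if the $i$th elementary divisor of $A(\lambda)$ corresponding to the linear factor 
$\lambda - a$ is $(\lambda - a)^{\alpha^{(i)}}$ ($i = 1, 2, \dots, r$) where $r$ is the rank of $A(\lambda)$, and if 

$\alpha^{(i)} < n$ ($i = 1, 2, \dots, \sigma$), 
$\alpha^{(\sigma+1)} \ge n$, 

then $\sigma$ is the reduced rank and $(\lambda - a)^{\alpha^{(i)}}$ ($i = 1, 2, \dots, \sigma$) is the $i$th elementary 
divisor of $A(\lambda)$ with respect to the modulus $(\lambda - a)^n$. 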


Plainly if $r$ is the rank of $A(\lambda)$ and if $\rho$ and $\sigma$ are respectively the rank and the 
reduced rank of $A(\lambda)$ with respect to the modulus $(\lambda - a)^n$, then $\sigma \le \rho \le r \le m$; 
moreover it should be noted that in either case all minors of order $k \le \sigma$ are 
divisible by $\prod_{i=1}^{k} (\lambda - a)^{\alpha^{(i)}}$ and that this $k$th determinant divisor may 
consequently be congruent to zero modulo $(\lambda - a)^n$, while the elementary 
divisors with respect to this modulus are not. In speaking of the elementary 
divisors of $A(\lambda)$ with respect to the modulus $(\lambda - a)^n$, it is not necessary to 
designate the linear factor, for it is always that occurring in the modulus. As 
a matter of convenience we shall still call $(\lambda - a)^{\alpha^{(k)}}$ the $k$th elementary divisor 
of $A(\lambda)$ even when $\alpha^{(k)} = 0$. Thus if $|A(\lambda)|$ is prime to $\lambda - a$, then each of the 
$m$ elementary divisors of $A(\lambda)$ with respect to the modulus $(\lambda - a)^n$ is unity. 


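LEMMA I. If $A(\lambda)$ is an $m \times m$ matrix having elements $a_{ij}(\lambda)$ ($i, j = 1, 2, 
\dots, m$) which are polynomials in $\lambda$, and if the reduced rank of $A(\lambda)$ with 
respect to the modulus $(\lambda - a)^n$ is $\sigma$ and its $i$th elementary divisor is $(\lambda - a)^{\alpha^{(i)}}$ 
($i = 1, 2, \dots, \sigma$), then two $m \times m$ matrices, $P(\lambda)$ and $Q(\lambda)$, of degree $n-1$ in $\lambda$, 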


690 


1933] ON A MATRIC EQUATION 691 


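exist such that $P(a)$ and $Q(a)$ are non-singular matrices whose elements do not 
depend upon $n$ but do depend upon $a$, and that 

$P(\lambda) A(\lambda) Q(\lambda) \equiv S(\lambda)$, mod $(\lambda - a)^n$, 

where $S(\lambda) = (s_{ij}(\lambda))$ and 

$s_{ij}(\lambda) \equiv 0$, mod $(\lambda - a)^n$ ($i \ne j$), 
$s_{ii}(\lambda) \equiv (\lambda - a)^{\alpha^{(i)}}$, mod $(\lambda - a)^n$ ($i \le \sigma$), 
$s_{ii}(\lambda) \equiv 0$, mod $(\lambda - a)^n$ ($i > \sigma$). 
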
According to a well known theorem* two non-singular mm matrices, 
T(A) and U(A), |T(A) | and |U(A) | independent of X, exist such that 
(2) 


where \—a; (¢=1, 2, - - - , é) are the distinct linear factors in \ common to all 
r-rowed minors of A(A), where 7 is the rank of A(A), and where A,(A) = 
(h=1, 2, - - - , are such that 


a;; = 0 
as; (x) = (A — (i<n), 
as, = 0 (i>r). 


Hence the &th composite elementary divisor of A(A), as usually defined, is 


— a)” (k = 1,2,---,7). 
Since A,(A) (¢=1, 2,---, é) are diagonal matrices, they are commutative 
one with another and (2) can be written in the form 
(3) T(NAQ)U(A) = 
Now if the last » —r zero elements in the principal diagonal of 


be replaced by unit elements, the resulting matrix, A/(A) (¢=1, 2,---, 


* Bôcher, Introduction to Higher Algebra, 1922, p. 91, and Theorem I, p. 94. 




k—1, k+1, k+2,---, #) may replace the corresponding matrix A,(A) in 
the right member of (3), and this substitution will not affect the validity of 
this equation in that the product by A,(A) leaves this member unchanged. 
The jth element j <r of the diagonal matrix 


= ? G) 


and is a polynomial of degree — ay in — a, whose constant term 
is not zero. Hence the polynomial 2;,(A) exists such that 


- = 1+ (A — 7 S 7, 


where w;,(A) is a polynomial in \. The remaining n—r elements of IT}z} 


A! A/ (A) are unit elements, hence we may let 2;,(A)=1 for r<j 
<n. Hence the diagonal matrix V,(A), having as its jth element 2;,(A) as 
defined above, exists such that 


where |Vz(ax) | #0 and V;,(a,) is again independent of n,. Hence (3) becomes 
T(A)A(A)U(A)Ve() = [Z + — 
where U(a,)V,(ax) is a non-singular matrix independent of m,. Now if we let 
a, =a and let 

T(A) = P(A), mod (A — a)", 
U(A)Vi(A) = QCA), mod (A — a)", 
A;(A) = S(A), mod (A — a)”, 

we have demonstrated that P(A) and Q(A) satisfying the lemma exist. 

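LEMMA II. If $\Delta(\lambda) = (\delta_{ij}(\lambda))$ ($i, j = 1, 2, \dots, n$), where 

$\delta_{i,i+k}(\lambda) = (\lambda - a)^{\alpha_k} d_k(\lambda)$ ($k = 0, 1, \dots, n-1$), 
$\delta_{i+h,i}(\lambda) = 0$ ($h = 1, 2, \dots, n-1$), 

where $d_k(\lambda)$ ($k = 0, 1, \dots, n-1$) are polynomials in $\lambda$ and where $d_0(a) \ne 0$ and 
$d_1(a) \ne 0$, then the degree of the $n$th elementary divisor of $\Delta(\lambda)$ corresponding to 
the linear factor $\lambda - a$ 

(1) does not exceed $n\alpha_0 - n + 1$ if $\alpha_0 > 1$ and $\alpha_1 > 0$, 

(2) does not exceed $(n+1)/2$ if $\alpha_0 = 1$ and $\alpha_1 > 0$, 

(3) is equal to $n\alpha_0$ if $\alpha_1 = 0$. 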
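
The three cases of the lemma can be checked numerically on small instances. The following is a minimal sketch, not the author's procedure: it assumes $n = 3$, $d_0 = d_1 = 1$, $d_k = 0$ for $k \ge 2$, and writes polynomials in $t = \lambda - a$ as coefficient lists. The helper names (`band_matrix`, `last_divisor_degree`) are hypothetical. The degree of the $n$th elementary divisor is obtained as the power of $t$ in the determinant minus the least power of $t$ occurring in the minors of order $n-1$.

```python
# Polynomials in t = lambda - a are coefficient lists, lowest power first.

def padd(p, q):
    n = max(len(p), len(q))
    return [(p[i] if i < len(p) else 0) + (q[i] if i < len(q) else 0)
            for i in range(n)]

def pmul(p, q):
    out = [0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out

def det(M):
    """Determinant of a square polynomial matrix by Laplace expansion."""
    if len(M) == 1:
        return M[0][0]
    total = [0]
    for j, entry in enumerate(M[0]):
        minor = [row[:j] + row[j + 1:] for row in M[1:]]
        term = pmul(entry, det(minor))
        if j % 2:
            term = [-c for c in term]
        total = padd(total, term)
    return total

def valuation(p):
    """Exponent of the highest power of t dividing p (None if p is zero)."""
    for k, c in enumerate(p):
        if c != 0:
            return k
    return None

def band_matrix(n, alphas):
    """Upper triangular matrix with t**alphas[k] on the k-th superdiagonal
    (assumption: d_k = 1); bands beyond len(alphas) are taken to be zero."""
    def entry(i, j):
        k = j - i
        if 0 <= k < len(alphas):
            return [0] * alphas[k] + [1]
        return [0]
    return [[entry(i, j) for j in range(n)] for i in range(n)]

def last_divisor_degree(M):
    """Degree in t of the n-th elementary divisor of M for the factor t."""
    n = len(M)
    vals = []
    for i in range(n):
        for j in range(n):
            minor = [row[:j] + row[j + 1:] for r, row in enumerate(M) if r != i]
            v = valuation(det(minor))
            if v is not None:
                vals.append(v)
    return valuation(det(M)) - min(vals)

n = 3
case1 = last_divisor_degree(band_matrix(n, [2, 1]))  # alpha_0 > 1, alpha_1 > 0
case2 = last_divisor_degree(band_matrix(n, [1, 1]))  # alpha_0 = 1, alpha_1 > 0
case3 = last_divisor_degree(band_matrix(n, [2, 0]))  # alpha_1 = 0
print(case1, case2, case3)
```

In these instances the first bound is attained, the second is not, and the third case gives the full degree $n\alpha_0$.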


The proof of this lemma consists in seeking a lower bound to the degree 
of $\lambda - a$ as a divisor of all $(n-1)$st-order minors of $\Delta(\lambda)$. The determinant 
$|\Delta(\lambda)|$ has the divisor $(\lambda - a)^{n\alpha_0}$, and none of higher degree in $\lambda - a$ if the 
lemma is satisfied; hence the difference between the lower bound so sought and 
$n\alpha_0$ is an upper bound for the degree of the $n$th elementary divisor of $\Delta(\lambda)$. 

The minor 6;;(A) has the factor (A—a)-, in \X—a; that of 6;,:.2(d) 
(k=1, 2,---+, m—1) is identically zero, whereas the minor of 4;,;,(A) 
(h=1, 2, - - - ,m—1) is ]"-*-! D(A), where D,(d) is the minor 
of order h obtained by dropping the first column and last »—h—1 columns 
and the last n—h rows of A(A). 

Now 


(4) 


and Do(A) =1. If ag>1 and a:>0, we can show by mathematical induction 
on the basis of the recurrence relation (4), that D,(A) has at least the factor 
(\—a)*, hence the minor of any element 6;,;_,(A) (k=1, 2, - - - , m—1) has 
at least the divisor (A—a)"-*—Y«0+* and that of lowest degree among them 
occurs for h=n—1; hence all (n—1)st-order minors of A(A) have at least the 
factor (A—a)"~! in common and for this case the degree of the mth elementary 
divisor of A(A) corresponding to the linear factor \—a is at most may—n+1. 

If ao =1 and a >0, D,(A) and D,(d) have at least the factor \—a. Then 
from (4) it readily follows that (A—a)*/? or (A—a)“+/2 are factors of 
D,(d) according as / is an even or an odd integer. Hence we can infer that 
the minor of 6;,,-,(A) (k=1, 2,---, m—1) is divisible by 
or by (A—a)"~*-1+ ("+ /2 according as h is even or odd. The divisor of lowest 
degree occurs for h=n—1, and is (A—a)”/? or (A—a)~»/® according as n 
is an even integer or an odd integer. That is, if ao =1 and a: >0 the degree of 
the mth elementary divisor of A(A) does not exceed 2/2 or (n+1)/2 according 
as m is an even or an odd integer. 

The third part of the lemma is evident, for the minor of $\delta_{n1}(\lambda)$ is prime 
to $\lambda - a$, since all terms of its expansion save $d_1^{\,n-1}(\lambda)$ have this factor if 
$\alpha_1 = 0$. Hence in this case the $n$th elementary divisor of $\Delta(\lambda)$ corresponding 
to $\lambda - a$ is $(\lambda - a)^{n\alpha_0}$. 


II. THE UNILATERAL SOLUTION 
Let the normal form of $A$ be given by $\bar{A} = (A_{ij})$, where 

$A_{ij} = 0$ ($i \ne j$), 
$A_{ii} = A_i$ ($i = 1, 2, \dots, r$), 

and $A_i$ ($i = 1, 2, \dots, r$) is an $m_i \times m_i$ matrix, $\sum_{i=1}^{r} m_i = n$, the elements of 
whose principal diagonal are $a_i$ and those in the diagonal directly above are 
$m_i - 1$ unit elements and the remaining $(m_i - 1)^2$ elements of $A_i$ are zeros. 
Hence $A - \lambda I$ has the simple elementary divisors $(\lambda - a_i)^{m_i}$ ($i = 1, 2, \dots, r$) 
and the non-singular matrix $Q$ exists such that 

(5) $A = Q \bar{A} Q^{-1}$. 

Moreover let (1) have the solution $X$ on the right whose normal form is 
$\bar{X} = (X_{ij})$ ($i, j = 1, 2, \dots, s$), where 

$X_{ij} = 0$ ($i \ne j$), 
$X_{jj} = X_j = x_j I_j + D_j$ ($j = 1, 2, \dots, s$), 

where $I_j$ is the $n_j \times n_j$ unit matrix, $x_j$ is a scalar constant and $D_j$ is the $n_j \times n_j$ 
matrix, whose elements are all zeros save those in the $k$th row and $(k+1)$st 
column ($k = 1, 2, \dots, n_j - 1$) which are unities. Thus we may write $D_j^0 = I_j$ 
and $D_j^k = 0$, $k \ge n_j$. The matrix $X - \mu I$ has the elementary divisors $(\mu - x_j)^{n_j}$ 
($j = 1, 2, \dots, s$) and the non-singular matrix $R$ exists such that 

(6) $X = R \bar{X} R^{-1}$. 


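On substituting for $A$ and $X$ in (1) by means of (5) and (6), and noting 
that $Q$ and $R$ are non-singular matrices, we obtain 

(7) $\sum_{k=0}^{p} F_k(\bar{A})\, T\, \bar{X}^{p-k} = 0$, 

where $T = Q^{-1}R$ and $\bar{A}$, $\bar{X}$ denote the normal forms of $A$ and $X$. In this 
equation $\bar{X}$ and $R$ (hence $T$) are the unknowns. In fact $T\bar{X}T^{-1}$ is a solution 
of $P(\bar{A}, X) = 0$; on the other hand if $X$ is a solution of $P(\bar{A}, X) = 0$, then 
$QXQ^{-1}$ is a solution of (1). 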
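
The passage from (1) to (7) rests on the identity $P(A, X) = Q\,[\sum_k F_k(\bar{A})\,T\,\bar{X}^{p-k}]\,R^{-1}$, which holds for any $A = Q\bar{A}Q^{-1}$ and $X = R\bar{X}R^{-1}$. The sketch below checks it on one small integer example. The matrices and the polynomials $F_0 = 1$, $F_1 = \lambda$, $F_2 = \lambda^2 + 1$ are illustrative assumptions, not taken from the paper, and the chosen $X$ is not claimed to be a solution.

```python
def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def matadd(A, B):
    return [[a + b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]

def identity(n):
    return [[int(i == j) for j in range(n)] for i in range(n)]

def matpow(M, e):
    out = identity(len(M))
    for _ in range(e):
        out = matmul(out, M)
    return out

def poly_at(coeffs, M):
    """Evaluate c0*I + c1*M + c2*M^2 + ... for a coefficient list."""
    out = [[0] * len(M) for _ in M]
    for power, c in enumerate(coeffs):
        out = matadd(out, [[c * x for x in row] for row in matpow(M, power)])
    return out

# Hypothetical coefficients: F_0 = 1, F_1 = lambda, F_2 = lambda^2 + 1.
F = [[1], [0, 1], [1, 0, 1]]
p = len(F) - 1

def unilateral(A, T, X):
    """sum_k F_k(A) T X^(p-k); with T = I this is P(A, X)."""
    n = len(A)
    out = [[0] * n for _ in range(n)]
    for k, coeffs in enumerate(F):
        term = matmul(matmul(poly_at(coeffs, A), T), matpow(X, p - k))
        out = matadd(out, term)
    return out

# Unimodular Q and R, so their inverses are integer matrices.
Q, Q_inv = [[1, 1], [0, 1]], [[1, -1], [0, 1]]
R, R_inv = [[1, 0], [1, 1]], [[1, 0], [-1, 1]]
A_bar = [[2, 1], [0, 2]]   # a single 2x2 normal-form block
X_bar = [[3, 0], [0, 5]]   # two 1x1 blocks

A = matmul(matmul(Q, A_bar), Q_inv)
X = matmul(matmul(R, X_bar), R_inv)
T = matmul(Q_inv, R)

lhs = unilateral(A, identity(2), X)                          # P(A, X)
rhs = matmul(matmul(Q, unilateral(A_bar, T, X_bar)), R_inv)  # Q [...] R^{-1}
print(lhs == rhs)
```

Since $Q$ and $R$ are non-singular, the left member vanishes exactly when the bracketed sum does, which is the content of (7).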

Let $T = (T_{ij})$, where $T_{ij}$ ($i = 1, 2, \dots, r$; $j = 1, 2, \dots, s$) is an $m_i \times n_j$ 
matrix; then from (7) we readily obtain the $rs$ equations 

(8) $\sum_{k=0}^{p} F_k(A_i)\, T_{ij}\, X_j^{p-k} = 0$ ($i = 1, 2, \dots, r$; $j = 1, 2, \dots, s$), 

which must be satisfied by the $rs$ independent matrices $T_{ij}$. Each of these 
equations provides a means of computing the corresponding $T_{ij}$, and 
consequently $T$, provided the matrices $X_j$ ($j = 1, 2, \dots, s$) were known. We shall 
seek restrictions upon $x_j$ of $X_j$ and upon its order $n_j$. 

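Now from $X_j = x_j I_j + D_j$ we have 

$X_j^h = \sum_{k=0}^{h} \binom{h}{k} x_j^{h-k} D_j^k$, 

and consequently (8), for $A_i$ and $X_j$, becomes 

(9) $\sum_{k=0}^{n_j-1} P_{0k}(A_i, x_j)\, T_{ij}\, D_j^k = 0$, 

where 

$P_{hk}(\lambda, \mu) = \dfrac{1}{h!\,k!}\, \dfrac{\partial^{h+k} P(\lambda, \mu)}{\partial \lambda^h\, \partial \mu^k}$. 

This equation must be satisfied by the sub-matrix $T_{ij}$ of $T$, in order that (1) 
have a solution whose characteristic matrix, $X - \mu I$, has the elementary 
divisor $(\mu - x_j)^{n_j}$, where $(\lambda - a_i)^{m_i}$ is an elementary divisor of $A - \lambda I$. 
Indicate the $m_i \times 1$ matrix formed by the $(k+1)$st column of $T_{ij}$ by the 
script letter $\mathcal{G}_{ij}^{(k)}$; then 

$T_{ij} = (\mathcal{G}_{ij}^{(0)}, \mathcal{G}_{ij}^{(1)}, \dots, \mathcal{G}_{ij}^{(n_j-1)})$ 

and 

$T_{ij} D_j^k = (0, \dots, 0, \mathcal{G}_{ij}^{(0)}, \mathcal{G}_{ij}^{(1)}, \dots, \mathcal{G}_{ij}^{(n_j-k-1)})$; 

that is, the multiplication of $T_{ij}$ on the right by $D_j^k$ moves the first $n_j - k$ 
columns of $T_{ij}$ $k$ spaces to the right and replaces the evacuated spaces by $k$ 
zero columns. Hence from (9) we readily obtain the equations 

$\sum_{h=0}^{k} P_{0h}(A_i, x_j)\, \mathcal{G}_{ij}^{(k-h)} = 0$ ($k = 0, 1, \dots, n_j - 1$). 

Multiply these for $k = 0, 1, \dots, n_j - 1$ respectively by $1, \mu - x_j, \dots, 
(\mu - x_j)^{n_j-1}$ and add the results; the single equation 

(10) $P(A_i, \mu)\, \mathcal{G}_{ij}(\mu) \equiv 0$, mod $(\mu - x_j)^{n_j}$, 

is thus obtained, where 

(11) $\mathcal{G}_{ij}(\mu) = \sum_{h=0}^{n_j-1} \mathcal{G}_{ij}^{(h)} (\mu - x_j)^h = T_{ij} \begin{pmatrix} 1 \\ \mu - x_j \\ \vdots \\ (\mu - x_j)^{n_j-1} \end{pmatrix}$ 

is consequently an $m_i \times 1$ matrix whose elements are polynomials of degree 
$n_j - 1$ in $\mu - x_j$. We shall henceforth concentrate upon equation (10) instead 
of (8). 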
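
The column-shifting effect of right multiplication by $D_j^k$, on which the passage from (9) to (10) depends, is easy to confirm on a small case. The sketch below uses an arbitrary $2 \times 3$ integer matrix in the role of $T_{ij}$ (an illustrative assumption) and the $3 \times 3$ matrix $D$ with units directly above the principal diagonal.

```python
def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def shift(n):
    """n x n matrix D with units in row k, column k+1, zeros elsewhere."""
    return [[int(j == i + 1) for j in range(n)] for i in range(n)]

T = [[1, 2, 3],
     [4, 5, 6]]          # stands in for T_ij with m_i = 2, n_j = 3
D = shift(3)

TD = matmul(T, D)        # columns moved one place to the right
TD2 = matmul(TD, D)      # columns moved two places to the right
D3 = matmul(matmul(D, D), D)

print(TD)
print(TD2)
print(D3)
```

Each multiplication by $D$ inserts one zero column on the left and drops the last column, and $D^3 = 0$ for the $3 \times 3$ case, in agreement with $D_j^k = 0$ for $k \ge n_j$.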

From (10) and (11), we see that $\mathcal{G}_{ij}(x_j) = \mathcal{G}_{ij}^{(0)} = 0$, if $|P(A_i, x_j)| \ne 0$; hence 
also $\mathcal{G}_{ij}^{(1)} = 0$ and so on, under the same hypothesis. That is, $T_{ij} = 0$, if 
$|P(A_i, x_j)| \ne 0$. Now not all $T_{ij}$ ($i = 1, 2, \dots, r$) can be zero, nor can all $T_{ij}$ 
($j = 1, 2, \dots, s$) be zero else $T$ would have $n_j$ zero columns or $m_i$ zero rows and in 
either case would be a singular matrix. Hence $|P(A_i, x_j)| = 0$ for at least one 
pair of values of $i$ and $j$; the necessary and sufficient condition that such be 
the case is that $P(a_i, x_j) = 0$. We have proved the theorem. 




THEOREM I. If the characteristic matrix $A - \lambda I$, of $A$, have the elementary 
divisors $(\lambda - a_i)^{m_i}$ ($i = 1, 2, \dots, r$), where $\sum_{i=1}^{r} m_i = n$, and if $P(A, X) = 0$ 
have a solution, $X$, on the right (or left) whose characteristic matrix, $X - \mu I$, 
has the elementary divisors $(\mu - x_j)^{n_j}$ ($j = 1, 2, \dots, s$), where $\sum_{j=1}^{s} n_j = n$, then 
every equation $P(a_i, \mu) = 0$ ($i = 1, 2, \dots, r$) must be satisfied by at least one of the 
numbers $x_j$ ($j = 1, 2, \dots, s$) and every equation $P(\lambda, x_j) = 0$ ($j = 1, 2, \dots, s$) must be 
satisfied by at least one of the numbers $a_i$ ($i = 1, 2, \dots, r$).† 
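
As an illustration of Theorem I, consider the hypothetical equation $X^2 - A = 0$, that is $F_0 = 1$, $F_1 = 0$, $F_2 = -\lambda$, so that $P(\lambda, \mu) = \mu^2 - \lambda$. The matrices below are chosen for this sketch only; both are triangular, so their characteristic values can be read from the principal diagonal.

```python
def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def P_scalar(lam, mu):
    """P(lambda, mu) = mu^2 - lambda for the illustrative equation X^2 - A = 0."""
    return mu * mu - lam

A = [[1, 3],
     [0, 4]]
X = [[1, 1],
     [0, 2]]

# Residual of the matrix equation P(A, X) = X^2 - A.
X2 = matmul(X, X)
residual = [[X2[i][j] - A[i][j] for j in range(2)] for i in range(2)]

a_values = [A[i][i] for i in range(2)]   # characteristic values of A
x_values = [X[i][i] for i in range(2)]   # characteristic values of X

# Every P(a_i, mu) = 0 has some x_j as a root, and every P(lambda, x_j) = 0
# has some a_i as a root.
each_a_matched = all(any(P_scalar(a, x) == 0 for x in x_values) for a in a_values)
each_x_matched = all(any(P_scalar(a, x) == 0 for a in a_values) for x in x_values)
print(residual, each_a_matched, each_x_matched)
```

Here $x = 1$ pairs with $a = 1$ and $x = 2$ pairs with $a = 4$, as the theorem requires.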


The above theorem shows where and how the characteristic values $x_j$ of 
a solution of (1) must be sought and consequently gives us some knowledge 
of the sub-matrices $X_j$ ($j = 1, 2, \dots, s$). For more definite information 
regarding them, we shall seek restrictions upon $n_j$, in addition to that we 
already know, namely that $\sum_{j=1}^{s} n_j = n$ in order that the non-singular matrix 
$T = (T_{ij})$ may exist. Such is given by the following theorems. 


THEOREM II. If $X$ is a solution of the polynomial equation $P(A, X) = 0$, 
and if $X - \mu I$ has the elementary divisors $(\mu - x)^{\nu_1}, (\mu - x)^{\nu_2}, \dots, (\mu - x)^{\nu_k}$ 
corresponding to the linear factor $\mu - x$, then $(\mu - x)^{\nu_1 + \nu_2 + \cdots + \nu_k}$ is a factor of 
$|P(A, \mu)|$; moreover if $X - \mu I$ has the elementary divisors $(\mu - x_j)^{n_j}$ ($j = 1, 
2, \dots, s$), $\prod_{j=1}^{s} \{P(\lambda, x_j)\}^{n_j}$ must be an exact multiple of $|A - \lambda I|$. 

It is known that if $P(A, X) = 0$, then $|P(A, \mu)|$ is divisible by $|X - \mu I|$‡, 
and this determinant in turn is the product of all its elementary divisors; 
hence the first part of the theorem is proved. Similarly if $X$ is a solution of 
(1), then $A$ satisfies the same equation, where we regard $X$ as the known 
matrix, and consequently $|P(\lambda, X)|$ must be divisible by $|A - \lambda I|$. But we 
can readily show that 

$|P(\lambda, X)| = \prod_{j=1}^{s} |P(\lambda, X_j)| = \prod_{j=1}^{s} \{P(\lambda, x_j)\}^{n_j}$. 

Hence the second part of the theorem is proved. 

The restrictions placed upon $n_j$ by this theorem are not very severe; 
nevertheless the first part of the theorem places an upper bound upon $n_j$ 
and the second part places a lower bound upon $n_j$ ($j = 1, 2, \dots, s$). The 
following results are far more restrictive and quite as easily applied in 
particular examples as are the above. 


THEOREM III. If $P(A, \mu)$ is of rank $\rho_j$ with respect to the modulus $(\mu - x_j)^{n_j}$ 
and if the $\rho_j$th elementary divisor of $P(A, \mu)$ with respect to the same modulus 


† This theorem is in part a special case of one proved elsewhere, Roth, loc. cit., Theorem I, p. 65. 
‡ Roth, loc. cit., Corollary I, p. 66. 




is (u—x;)%s", then the equation (1) may have the solution X whose characteristic 
matrix, X —I, has the elementary divisor (u—x;)"i only if 


aj + pi— 


and if c;6i) <nj, that is, if the reduced rank and the rank of P(A, ») with respect 
to the modulus (u—x;)"i are the same, then X—plI has the elementary divisor 
(u—2x;)"1 at most k times only if 


h! 


P iy 
= mod (u — x;)"i (hk = 04; +1,---,m;— 1); 


= 0, mod (u 


then P(A;, u) is of rank p;;=m;—o;; with respect to the modulus (u—x,)% 
and equation (10) reduces to the following non-homogeneous system of 
pi; equations in the p;; unknowns (h=0, 1, -- , pi;—1)T: 
pet) (u) (u) t@(p) 
0 PO(u) +++ plmi-2)(y) 


(u) 


= (u — 


where g(u) (k=o, o+1,--+, m;—1) are arbitrary polynomials in uy. 
The equation (10) imposes no restrictions upon ¢ (yu), /(u), - - , 
hence each of the first o;; rows of T;; has m; arbitrary elements and its remain- 
ing rows must be such that (11) and (12) are satisfied. From (12) we see at 
once that 

(u — 


[p©(u) 


(y) = 


(A = 0,1,--+, pi; — 1), 


T In this equation and in the remainder of the proof of this lemma we suppress the subscripts 


i and j of oP (u) of o4; and of filed (u) save where ambiguity may arise. 


(95) — pj 

aj 

If 
0 
(12) 




where M;(u) is a linear combination of minors of order p;;—h—1 of P(A;, u) 
and consequently has the (p;;—4—1)th determinant divisor in u—<; of this 
matrix as its divisor. We shall now seek a lower bound to the degree of 
asa divisor of t(u) (R=o,o+1, - - - ,m;—1). 

Let p(u) have the factor (u—-x;)*i and t(u) (k=o,0+1, - - - , m;—1) 
have the factor (u—x)"s and let neither have a divisor of higher degree 
than these in »—x;. Moreover, let the kth elementary divisor of P(A;, u) be 
(u—x,)"i (k=1, 2, - - - , pi;). Then the determinant divisor of all minors of 
order g of P(A;, u) is] gSpi;, and 

Pij 
(ms — = = 
k=1 
M)(u) has the factor u—x; at least af} times, and must satisfy 
the inequality 
pig—h—-1 


2a Lat + ais, 


k=] k=1 


(o+h) 
Tij 


or 


(o+h) 
(13) Tij = nN; hej; Qi; 1). 


k=p,j—h 


This inequality evidently establishes a lower bound for the number of zero 
elements in the (¢+h)th row of 7;;, for if ti =k then the elements in the 
first k columns and the (¢ +/)th row of T;; are zero. The least value the right 
member of the inequality may have for any / occurs for h=0, that is, 
(h=0,1, - - - ,pi;—1). Therefore T;; has only zero elements 
in at least the first n;—as columns of the last p;; rows, whereas the first 
m;—pi;=0;; rows have arbitrary elements as was pointed out above. Since 


P(A,, u) S 
P(Az, w)--- 0 


P(A, 


the rank of P(A, y), hence of P(A, u), with respect to the modulus (u—x,)"% 

is p;, where pj=)_;-; pi; and p,; is the rank of P(A,, u) with respect to the 

same modulus. Moreover, if the p; numbers al (k=1, 2,---, ps3 2, 
- ,r) be rearranged in an ascending sequence 


(2) (93) 
aj S a; <---Sa;’ 


then (u—x,)%{" is the kth elementary divisor of P(A, u) and of P(A, y) with 


= 
P(A, | . . . . . . . . . 
0 




respect to the modulus (u—x;)"" and a¥/ is the greatest of the numbers 
According to (10) 


(14) P(A, = 0, mod (u — xj)", 


where 


| Ti; 1 1 
i(u) 


T2; 4; 2; 


Gi(u) = 


Gri(u) Fes (u — — 


The matrix $T_j$ has $\sum_{i=1}^{r} (m_i - \rho_{ij}) = n - \rho_j$ rows of arbitrary elements and the 
remaining rows have only zero elements in the first $n_j - \alpha_j^{(\rho_j)}$ columns. Hence 
$\alpha_j^{(\rho_j)}$ must equal or exceed $n_j - n + \rho_j$ else $T_j$ is of rank less than $n_j$ and $T$ 
would be singular. This proves the first part of the theorem. Now if $n_j > \alpha_j^{(\rho_j)}$ 
the reduced rank of $P(A, \mu)$ is $\rho_j$ and if $X - \mu I$ have the elementary divisor 
$(\mu - x_j)^{n_j}$ repeated $k$ times then $T$ has $k$ matrices $T_j$ all having the same 
$n - \rho_j$ rows of arbitrary elements. Each $T_j$ has at least $n_j - \alpha_j^{(\rho_j)}$ zero columns 
in the remaining rows. Hence $k(n_j - \alpha_j^{(\rho_j)})$ cannot exceed $n - \rho_j$ else the 
corresponding $kn_j$ columns of $T$ are of rank less than $kn_j$ and $T$ would be singular. 
This proves the final part of the theorem. 
The second part of the theorem may be stated as follows: 


COROLLARY I. If the rank $\rho_j$ of $P(A, \mu)$ with respect to the modulus $(\mu - x_j)^{n_j}$ 
is equal to the reduced rank with respect to the same modulus and if $(\mu - x_j)^{\alpha_j^{(\rho_j)}}$ 
is its $\rho_j$th elementary divisor, then the characteristic matrix, $X - \mu I$, of a 
solution of (1) cannot have the elementary divisor $(\mu - x_j)^{n_j}$ more than 
$(n - \rho_j)/(n_j - \alpha_j^{(\rho_j)})$ times. 


Plainly if 2; be taken sufficiently large the rank of P(A, u) with respect 
to the modulus (u—2,)"/ is equal to the rank, 7, of P(A, u) in the usual sense; 
that is, all minors of order r+1 and above are identically zero whereas those 
of order r are not all identically zero and in this case by Theorem III we have 
n; Sas” —n+r,where (u —x,)%*” is the rth elementary divisor of P(A, u) corre- 
sponding to the linear factor »—x;. Hence: 


COROLLARY II. If $P(A, \mu)$ is of rank $r$ and if the $r$th elementary divisor of 
P(A, mu) corresponding to the linear factor »—x; is (u—x;)%”, then no solution 
X of P (A, X) =0 exists whose characteristic matrix, X —pI, has an elementary 
divisor corresponding to the linear factor 4 —x; whose degree exceeds 


(r) 
a; —n+r. 




The following corollary is at once evident from the foregoing. 


Corotiary III. If P(A, yu) is of rank r<n, then X —plI, where X is a solu- 
tion of (1), may have the elementary divisor (u—x)*, where x is an arbitrary 
parameter, only if 

In this case, where x is arbitrary, the rth elementary divisor of P(A, u) 
corresponding to u—<2 is unity, and a“ =0. A more complete discussion of 
this case is given in the paper cited abovet, where the method of computing 
the matrix corresponding to T is covered in some detail. 


THEOREM IV. If $P(A, \mu)$ has the reduced rank $\rho$ with respect to the modulus 
$(\mu - x)^{\nu}$, and if $P(A, X) = 0$, then the number of elementary divisors of $X - \mu I$ 
corresponding to the same linear factor $\mu - x$ and whose degree equals or exceeds 
$\nu$ cannot exceed $n - \rho$. 

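If $P(A, \mu)$ has the elementary divisors $(\mu - x)^{\alpha^{(k)}}$ ($k = 1, 2, \dots, \rho$), where 
$\alpha^{(1)} \le \alpha^{(2)} \le \cdots \le \alpha^{(\rho)} < \nu$, and if the remaining elementary divisors of 
$P(A, \mu)$ corresponding to the same linear factor $\mu - x$ are all of degree equal 
to or greater than $\nu$, then according to Lemma I there exist matrices $R(\mu)$ 
and $S(\mu)$ such that $|R(x)| \ne 0$ and $|S(x)| \ne 0$ and that 

$R(\mu)\, P(A, \mu)\, S(\mu) \equiv Q(\mu)$, mod $(\mu - x)^{\nu}$, 

where $Q(\mu) = (q_{ij}(\mu))$ ($i, j = 1, 2, \dots, n$) is given by 

$q_{ij}(\mu) \equiv 0$, mod $(\mu - x)^{\nu}$ ($i \ne j$), 

$q_{ii}(\mu) \equiv (\mu - x)^{\alpha^{(i)}}$, mod $(\mu - x)^{\nu}$ ($i \le \rho$), 

$q_{ii}(\mu) \equiv 0$, mod $(\mu - x)^{\nu}$ ($i > \rho$). 

By (5) $P(A, \mu) = Q\, P(\bar{A}, \mu)\, Q^{-1}$, where $\bar{A}$ is the normal form of $A$; hence 

$R(\mu)\, Q\, P(\bar{A}, \mu)\, Q^{-1} S(\mu) \equiv Q(\mu)$, mod $(\mu - x)^{\nu}$, 

and (14) becomes 

$Q(\mu)\, S^{-1}(\mu)\, Q\, \mathcal{G}'(\mu) \equiv 0$, mod $(\mu - x)^{\nu}$, 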

where 7’ is an mXv matrix formed of v adjacent columns of T and G’(u) 
is the »X1 matrix, whose elements are polynomials of degree y—1 in p—x, 
and ©’ (x) is the first of these v columns of 7’. From this equation we see that 
the element of the kth row of the »X1 matrix S-'(u) QG’(u) is divisible by 
(u—x)’-*”, if k<p, and is prime to u—x, if k>p. Consequently the nX1 


matrix 
= U 


t Roth, loc. cit., §3. 




has p zero elements in the first p rows and arbitrary elements in the remaining 
n—p rows. Now if X—ylI has the & elementary divisors (u—x)”%, 
(i=1, 2,---+, k), and if X is a solution of (1), then for each (u—-x)"* we 
must have 


S-'(x)QGi (x) = Ui, 


where U; has zero elements in at least the first p rows. The reduced rank of 
P(A, mw) with respect to the modulus (u—-x)’#, v;=v, cannot be less than p, 
and S-'(x) is not dependent upon the degree of the modulus (u—<x)’. 
Consequently 


S'(x)Q(Gi (x), (x), Tx (x)) = (Ui, U2, U;). 


The rank of S-'(x)Q is m and the rank of the right member is at most n—p; 
hence in order that the & columns G,;’(x) (i=1, 2, - - - , k) of T may forma 
matrix of rank k, k cannot exceed n—p. The theorem here demonstrated 
is more general than Corollary I, but if ; —al? ) >2, the latter offers the more 
restrictive bound upon the number of equal elementary divisors that X —yJ 
may have. 

THEOREM V. If $P(a_i, \mu) = 0$ has the root $x_j$ of multiplicity $\beta_{ij}$, and if 
$P(\lambda, x_j) = 0$ has the root $a_i$ of multiplicity $\gamma_{ij}$, if $A - \lambda I$ has the elementary divisors 
$(\lambda - a_i)^{m_i}$ ($i = 1, 2, \dots, r$) and if $X - \mu I$ has the elementary divisors $(\mu - x_j)^{n_j}$ 
($j = 1, 2, \dots, s$) where $X$ is a solution of $P(A, X) = 0$, then at least one $n_j$ 
($j = 1, 2, \dots, s$) must equal or exceed the corresponding $n_{ij}$ for each value of $i$ 
($i = 1, 2, \dots, r$), where 


1 
(8i; > 1, vi; > 1), 


(6; > 1, = 
(Bi; 1, Vii 


= © (Bi; = = 0). 


Under the hypotheses of this theorem neither P(a;, u) (¢=1, 2,---, 1) 
nor P(A, x;) (j=1, 2, - - - , s) is identically zero. Hence the rank of P(A, X), 
where X is a solution of P(A, X) =0, and of P(A, yu) is m. Now if we regard A 
as a solution of (1), where X is the known matrix, then according to Corollary 
II, m; is less than or equal to 6, where (A—a,)*™ is the nth elementary 
divisor of P(A, X). Now at least one of the matrices P(A, X;) (7 =1, 2, - - - , 5) 
must have (A—a,)*” as its mjth elementary divisor corresponding to the 
linear factor \—a;. Lemma II gives us a means of computing an upper bound 


1), 
Nis = 2m; 

mM; 

= 




to the degree of the njth elementary divisor of P(A, X;) (j=1, 2,---,5). 
For if we set A(A) = P(A, X;), then 
P. (A, 

and if (u—x,)"*, yi;>1, is a factor of P(a;, then 5;,:4:(A) = Po,1(A, x;) will 
have \—a; as a factor; on the other hand if y;;=1, Po,(A, x;) is prime to 
A—a;. Let P(A, x;) have the factor (A —a,)g*/ but not one of higher degree in 
\—a,; then according to Lemma II and Theorem III, m; cannot exceed 
every m;;, where 

mi; = Bin; — nj +1 > 1, viz > 1), 

nN; + 1 

= 2 1, Vii > 1), 

Miz = (Bi; = 1, 1), 

m;; = 0 = 0, yi; = 0). 


Hence not all ; (j=1, 2, - - - , s) can be less than the numbers n,; defined in 
the theorem. When §;;=7:;=0, then |P(A i, Xj) |~0 and the corresponding 
T;;=0, and we must take m;;=0 since not all T;; (j=1, 2, - - - , s) can be 
zero. Similarly m;; must be taken sufficiently large in case 8;; =i; =0. 

According to the theorem above if 6;;=y:;=1, and if 6,;=0, h¥i, and 
vix=0, we must have n;2m,, for njj=m; and k¥j. The 
mith elementary divisor of P(A;, uw) corresponding to the linear factor 
u—x; is (u—x;)"* because of Lemma I, and has only the ele- 
mentary divisors unity corresponding to the same linear factor if 8,;=0. 
Hence the elementary divisor of highest degree of P(A, yu) corresponding to 
u—x; is (u—x,;)”'. That is, by Theorem III, »;<m;. Consequently under the 
hypotheses here laid down n;=m,; and the equation P(A;, uw) G;;(u)=0, 
mod (u—«,)™, has a solution such that |7;;|~0. We have consequently 
proved the following corollary. 


Corotiary IV. If A—XI has the elementary divisors (\—a,)™ (i=1, 2, 
-- , r) such that a, and if the equations P(a;, =0 (¢=1,2,---, 
r) have the distinct simple roots x;;(j=1,2,---, pi) such that a; is a simple 
root of each of the equations =0 (j=1, 2,---, pi) and that P(az, 
0, (j=1, 2,---, pi), then P(A, X)=0 has p; solutions, X, 
such that X —pl has the elementary divisors (u—x;;)™ (t=1, 2,---,7). 
The numbers x;; in the elementary divisor (u—x;;)"* (¢=1, 2, -- - ,r) can 
be chosen in p; ways and all are distinct, for if x;;=x,., i*h, then P(a,, x;;) = 
P(ax, Xxx) =0, which is contrary to hypothesis. 




III. THE BILATERAL SOLUTION 
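DEFINITION. If $B = (b_{ij})$ ($i = 1, 2, \dots, \alpha$; $j = 1, 2, \dots, \beta$), then 
$B' = (b_{\beta-j+1,\,\alpha-i+1})$ is the transverse of $B$. 
The transverse of a matrix is obtained by reflecting its elements with 
respect to a line at right angles to that with respect to which the transpose 
of the matrix is obtained. For example, if 

$B = \begin{pmatrix} b_{11} & b_{12} \\ b_{21} & b_{22} \\ b_{31} & b_{32} \end{pmatrix}$, then $B' = \begin{pmatrix} b_{32} & b_{22} & b_{12} \\ b_{31} & b_{21} & b_{11} \end{pmatrix}$. 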

The following theorems hold: 

The transverse of the sum of two or more matrices is equal to the sum of their 
transverses. 

The transverse of the product of two or more matrices is equal to the product of 
their transverses taken in reverse order; $(AB)' = B'A'$. 

If $A = aI + D$, where $D = (\delta_{ij})$ and 

$\delta_{i,i+1} = 1$ ($i = 1, 2, \dots, n-1$), 
$\delta_{ij} = 0$ ($i + 1 \ne j$), 

then $A' = A$. 
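
The transverse and the three properties just stated can be tried out directly. The sketch below reads the definition through the $3 \times 2$ example of the text: the element in row $j$, column $i$ of $B'$ is taken from row $\alpha - i + 1$, column $\beta - j + 1$ of $B$ (the function name `transverse` and the test matrices are assumptions of this sketch).

```python
def transverse(B):
    """Reflect B about its secondary diagonal; an a x b matrix becomes b x a."""
    a, b = len(B), len(B[0])
    return [[B[a - 1 - i][b - 1 - j] for i in range(a)] for j in range(b)]

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def matadd(A, B):
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(A, B)]

# The 3 x 2 example of the text, with labels in place of numbers.
B_labels = [["b11", "b12"], ["b21", "b22"], ["b31", "b32"]]
B_labels_t = transverse(B_labels)

# Numerical matrices for the sum and product rules.
A = [[1, 2, 3], [4, 5, 6]]          # 2 x 3
B = [[1, 0], [2, 1], [0, 3]]        # 3 x 2
C = [[7, 1], [0, 2], [5, 5]]        # 3 x 2

sum_rule = transverse(matadd(B, C)) == matadd(transverse(B), transverse(C))
product_rule = transverse(matmul(A, B)) == matmul(transverse(B), transverse(A))

# A matrix of the form aI + D is its own transverse.
J = [[5, 1, 0], [0, 5, 1], [0, 0, 5]]
jordan_rule = transverse(J) == J
print(B_labels_t, sum_rule, product_rule, jordan_rule)
```

Equivalently, the transverse is the transpose with both its rows and its columns written in reverse order, which is why the product rule mirrors that for transposes.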


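DEFINITION. If $B = (B_{ij})$, where $B_{ij}$ ($i = 1, 2, \dots, r$; $j = 1, 2, \dots, s$) are 
$\alpha_i \times \beta_j$ matrices such that $\sum_{i=1}^{r} \alpha_i = \alpha$, $\sum_{j=1}^{s} \beta_j = \beta$, then the $\beta \times \alpha$ matrix 

$B^* = (B_{ji}')$, 

where $B_{ij}'$ is the transverse of $B_{ij}$, is the compound transverse of $B$ with respect 
to the sub-matrices $B_{ij}$. 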


The compound transverse of a matrix depends upon the way it is divided 
into sub-matrices. The following theorems hold. 

The compound transverse of the sum of two or more matrices is equal to the 
sum of the transverses of the addend matrices, provided all addend matrices are 
divided into sub-matrices in the same way. 

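If $B = (B_{ij})$ and $C = (C_{jk})$, where $B_{ij}$ ($i = 1, 2, \dots, r$; $j = 1, 2, \dots, s$) 
are $\alpha_i \times \beta_j$ matrices and $C_{jk}$ ($j = 1, 2, \dots, s$; $k = 1, 2, \dots, t$) are $\beta_j \times \gamma_k$ 
matrices, such that $\sum \alpha_i = \alpha$, $\sum \beta_j = \beta$ and $\sum \gamma_k = \gamma$, then the compound 
transverse of $BC$ is the $\gamma \times \alpha$ matrix obtained by multiplying the compound 
transverse of $C$ on the right by the compound transverse of $B$; that is, 

$(BC)^* = C^*B^*$. 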
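
The compound transverse and its product rule can be sketched as follows. The function `compound_transverse` and the particular partitions are assumptions made for illustration: each sub-matrix is replaced by its transverse and the array of sub-matrices is transposed, which is the reading of the definition under which a block-diagonal normal form is left unchanged.

```python
def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def offsets(sizes):
    out, total = [], 0
    for s in sizes:
        out.append(total)
        total += s
    return out

def compound_transverse(M, row_sizes, col_sizes):
    """Compound transverse of M for the given block row and column sizes."""
    row_off, col_off = offsets(row_sizes), offsets(col_sizes)
    out = [[0] * sum(row_sizes) for _ in range(sum(col_sizes))]
    for bi, a in enumerate(row_sizes):
        for bj, b in enumerate(col_sizes):
            r0, c0 = row_off[bi], col_off[bj]
            for p in range(a):
                for q in range(b):
                    # Entry (p, q) of the block lands at (b-1-q, a-1-p)
                    # of the transversed block, placed at block (bj, bi).
                    out[c0 + (b - 1 - q)][r0 + (a - 1 - p)] = M[r0 + p][c0 + q]
    return out

# B is 3 x 3 with block rows (1, 2) and block columns (2, 1);
# C is 3 x 2 with block rows (2, 1) and block columns (1, 1).
B = [[1, 2, 3], [4, 5, 6], [7, 8, 10]]
C = [[1, 0], [2, 1], [0, 3]]

left = compound_transverse(matmul(B, C), [1, 2], [1, 1])
right = matmul(compound_transverse(C, [2, 1], [1, 1]),
               compound_transverse(B, [1, 2], [2, 1]))

# A normal form with diagonal blocks of orders 2 and 1 is unchanged.
A_bar = [[2, 1, 0], [0, 2, 0], [0, 0, 7]]
A_bar_star = compound_transverse(A_bar, [2, 1], [2, 1])
print(left, right, A_bar_star == A_bar)
```

The product rule requires the column partition of the first factor to agree with the row partition of the second, as in the statement above.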




If $\bar{A}$ and $\bar{X}$ are the matrices in the normal forms as given in §II, and if 
transversion of them is made with respect to their sub-matrices $A_i$ ($i = 1, 2, \dots, r$) 
and $X_j$ ($j = 1, 2, \dots, s$) respectively, then $\bar{A}^* = \bar{A}$ and $\bar{X}^* = \bar{X}$. 

If $A$ is an $n \times n$ matrix, then the elementary divisors of $A - \lambda I$ are identical 
with those of $(A - \lambda I)^*$ for any division of $A - \lambda I$ into sub-matrices, and identical 
with those of $A^* - \lambda I$ provided the transversion of $A$ is made with respect to its 
sub-matrices $A_{ij}$ ($i, j = 1, 2, \dots, r$) such that $A_{ii}$ are all square matrices of 
order $n_i$ and $\sum n_i = n$. 

If $B$ is a non-singular $n \times n$ matrix and if $B^{-1}$ is its inverse, then for every 
division of $B$ into sub-matrices there exists a corresponding division of $B^{-1}$ into 
sub-matrices such that for $B$ and $B^{-1}$ so divided 

$(B^*)^{-1} = (B^{-1})^*$. 

If $B = (B_{ij})$ and $B^{-1} = (C_{ji})$ where $B_{ij}$ are $\alpha_i \times \beta_j$ matrices and $C_{ji}$ are $\beta_j \times \alpha_i$ 
matrices, the theorem is satisfied provided $\sum_i \alpha_i = \sum_j \beta_j = n$. 

The idea of transversion and compound transversion of matrices as 
defined above enables us to determine the relationship of a solution on the 
right of (1) to one having the same normal form on the left, and their relation 
to the bilateral solution of the same equation. 

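THEOREM VI. If $A = Q\bar{A}Q^{-1}$ and $X = R\bar{X}R^{-1}$, where $\bar{A}$ and $\bar{X}$ are the 
normal forms of $A$ and $X$ as defined in §II, if $A - \lambda I$ and $X - \mu I$ have the 
elementary divisors $(\lambda - a_i)^{m_i}$ ($i = 1, 2, \dots, r$) and $(\mu - x_j)^{n_j}$ ($j = 1, 2, \dots, s$) 
respectively, where $\sum_{i=1}^{r} m_i = \sum_{j=1}^{s} n_j = n$, and if $X$ is a solution on the right of 

(1) $P(A, X) \equiv \sum_{k=0}^{p} F_k(A)\, X^{p-k} = 0$, 

then 

$X_1 = R_1 \bar{X} R_1^{-1}$ 

is a solution of 

(15) $\sum_{k=0}^{p} X^{p-k} F_k(A) = 0$, 

provided 

$R_1 = QQ^*(R^*)^{-1}$, 

where $Q$ and $R$ are divided into sub-matrices of order $m_i \times n$ and $n_j \times n$ 
respectively. 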
If we assume that (15) has a solution on the left whose characteristic 
matrix has the elementary divisors $(\mu - x_j)^{n_j}$ ($j = 1, 2, \dots, s$) then 
$R_1$ must exist such that $X_1 = R_1 \bar{X} R_1^{-1}$. Hence by a procedure parallel to 
that of §II, we obtain from (15) the following equation: 

(16) $\sum_{k=0}^{p} \bar{X}^{p-k}\, U\, F_k(\bar{A}) = 0$, 

where $U = R_1^{-1}Q$. 

Now if we take the compound transverses of the members of (7) with 
respect to the sub-matrices $A_i$ ($i = 1, 2, \dots, r$), $X_j$ ($j = 1, 2, \dots, s$) and $T_{ij}$ 
of $\bar{A}$, $\bar{X}$, and $T$ respectively we have 

$\sum_{k=0}^{p} \bar{X}^{p-k}\, T^*\, F_k(\bar{A}) = 0$. 

That is $U = T^*$ satisfies (16) provided $T$ satisfies (7), and similarly any $U$ 
satisfying (16) is such that $U^*$ satisfies (7). Hence 

$U = R_1^{-1}Q = T^* = R^*(Q^*)^{-1}$, 

or $R_1 = QQ^*(R^*)^{-1}$ according to the theorems on the transversion of matrices. 
This proves the theorem. 

In order that $X$ be a bilateral solution of (1) it suffices but is not necessary
that $R_1 = QQ^*(R^*)^{-1} = R$; in other words, that $RR^* = QQ^*$. The following
theorem holds.


THEOREM VII. In order that $X$ be a bilateral solution of (1) it is necessary
and sufficient that this equation have a solution $X = R\bar{X}R^{-1}$ on the right such that
(7) is satisfied by $T_1$ and $T_2 = (T_{ij})$, not necessarily distinct, and such that
$T_1T_2^* = I$, where $T_2^*$ is the compound transverse of $T_2$ with respect to the
sub-matrices $T_{ij}$ of order $m_i \times n_j$.


If $X = R\bar{X}R^{-1}$ is a bilateral solution of (1), then $T_1 = Q^{-1}R$ satisfying (7)
exists and $U = R^{-1}Q$ satisfying (16) exists and $U^*$ also satisfies (7). That is,
$T_2$, some solution of (7), is $U^*$ or $(R^{-1}Q)^*$. It is sometimes possible that
$T_1T_2^*$ cannot be a unit matrix, but if in $T_2$ we permit the parametric
elements of $T_1$ to take another set of values, then it is possible for $T_1$ to be the
inverse of $T_2^*$ where $T_2$ is so taken.


IV. SOLUTIONS COMMUTATIVE WITH A 


Little that is general can be said regarding the solutions of (1) which are 
commutative with A besides that already demonstrated to hold for the 
unilateral and bilateral solutions of the same equation. But with certain 
restrictions upon either A or X or on both we can derive such results on com- 
mutative matrices as are set forth in the following theorems. 


706 W. E. ROTH [July 


THEOREM VIII. If $AX = XA$, if $A - \lambda I$ has the elementary divisors $(\lambda - a_i)^{m_i}$
$(i = 1, 2, \dots, r)$, where $a_i \ne a_j$, $i \ne j$, and


$m_1 \ge m_2 \ge \cdots \ge m_r,$


and if $X - \mu I$ has the elementary divisors $(\mu - x_j)^{n_j}$ $(j = 1, 2, \dots, s)$, where

$n_1 \ge n_2 \ge \cdots \ge n_s,$

but where $x_j$ $(j = 1, 2, \dots, s)$ are not necessarily distinct; then


$\sum_{k=1}^{h} n_k \le \sum_{k=1}^{h} m_k$

and 

We shall here use the notation of §II with the understanding that the
sub-matrices $\bar{A}_i$ $(i = 1, 2, \dots, r)$ and $\bar{X}_j$ $(j = 1, 2, \dots, s)$ of $\bar{A}$ and $\bar{X}$ are so
ordered that their orders $m_i$ and $n_j$ respectively form non-increasing sequences
of numbers. This is in no sense a restriction upon $A$ nor upon $X$.

Since AX = XA, we have from (5) and (6) that 


ATXT— = TXT—A, 
where T=R-10. Therefore TXT-' is commutative with A, but the most 


general matrix commutative with $\bar{A}$, where all $a_i$ $(i = 1, 2, \dots, r)$ are
distinct, is $K = (K_{hk})$, where


$K_{hk} = 0 \quad (h \ne k),$
$K_{hh} = c_0I_h + c_1D_h + \cdots + c_{m_h-1}D_h^{m_h-1} \quad (h = 1, 2, \dots, r),$
and where $D_h$ is an $m_h \times m_h$ matrix having only unit elements in the diagonal
immediately above the principal diagonal and having zero elements in the
remaining $m_h^2 - m_h + 1$ places.† Hence
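$T_{ij}\bar{X}_j = K_{ii}T_{ij} \quad (i = 1, 2, \dots, r;\ j = 1, 2, \dots, s),$
and

$T_{ij}x_j + T_{ij}D_j = c_0T_{ij} + c_1D_iT_{ij} + \cdots + c_{m_i-1}D_i^{m_i-1}T_{ij}.$


If $c_0 \ne x_j$, then $T_{ij} = 0$, but not all $T_{ij}$ $(i = 1, 2, \dots, r)$ may be zero else $T$
would have $n_j$ zero columns and would be singular. Hence we can assume that
$c_0 = x_j$ and the equation above reduces to


$T_{ij}D_j = c_1D_iT_{ij} + c_2D_i^2T_{ij} + \cdots + c_{m_i-1}D_i^{m_i-1}T_{ij}.$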


The multiplication of $T_{ij}$ by $D_j$ on the right moves the columns of $T_{ij}$ one


† Kreis, Contribution à la Théorie des Systèmes Linéaires, Zurich Thesis, 1906. Hilton, Linear
Substitutions, 1914, pp. 112-118.




space to the right, and the multiplication of $T_{ij}$ on the left by $D_i^k$ moves the
rows of $T_{ij}$ up $k$ spaces. Because of this fact, whether $c_1$ is zero or not, $T_{ij}$ has
at least $n_j - h$ zero elements in the first $n_j - h$ places of the $(m_i - h + 1)$st row
$(h = 1, 2, \dots, m_i)$. If $n_j > m_i$, then at least the first $n_j - m_i$ columns of $T_{ij}$
have only zero elements. If any $n_j$ exceeds every $m_i$ $(i = 1, 2, \dots, r)$, then
$T$ will have at least one column of zero elements and this is impossible;
consequently the largest $n_j$ cannot exceed the largest $m_i$, or


$n_1 \le m_1.$
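The shifting effect used in this step can be checked on a small example. The following Python sketch is an illustration only: the helper names `shift` and `matmul` and all sample matrices are assumptions introduced here and do not come from the paper. It builds the matrix with units just above the principal diagonal, multiplies a rectangular block by it on either side, and verifies that a block whose first $n_j - m_i$ columns vanish can intertwine the two shift matrices.

```python
def shift(n):
    """n x n matrix with units on the diagonal just above the principal one."""
    return [[1 if j == i + 1 else 0 for j in range(n)] for i in range(n)]


def matmul(a, b):
    """Plain matrix product of two matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


# A 3 x 4 block T (illustrative values only).
T = [[1, 2, 3, 4],
     [5, 6, 7, 8],
     [9, 10, 11, 12]]

right = matmul(T, shift(4))   # columns of T move one space to the right
left = matmul(shift(3), T)    # rows of T move up one space

# A 2 x 4 block whose first 4 - 2 = 2 columns are zero and which
# satisfies D_2 S = S D_4 (hypothetical sample values).
S = [[0, 0, 7, 5],
     [0, 0, 0, 7]]
intertwines = matmul(shift(2), S) == matmul(S, shift(4))
```

Here `right` has a zero first column and `left` has a zero last row, which is the mechanism that forces the zero pattern of the blocks in the argument above.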


This completes the first step of a mathematical induction proof. Now suppose 
we have shown that 


(17) $\sum_{i=1}^{h} n_i \le \sum_{i=1}^{h} m_i$
and let 


Tr = (Tii) = 1,2,--- 
T® = (§=1,2,---, =ht1,h442,-- 
T® = (Ti) htiat2,--- 


If 441 ma41, then from (17) we have at once 


At+1 


Dim 
i=1 


i=1 


and the theorem holds; however if #141 >4:, further proof is required. The 
number of zero columns in Te. is at least equal to Sit (n:—mrs), for if 
then every n,(i=1, 2, - - - , A4+1) exceeds m; for j =>h+1 and all 
T;; in Ti; have at least the first m;—m,4: columns of zero elements. The 
number of non-zero rows in 7,41 in the same columns where 72; has only 
zero elements is equal to at most p Dries (m;—mnrsi). The rank of the first 


h htt 
a; columns of T will be less than unless 
h+1 h+1 


— Mazi) S — Mansi); 


and the theorem is proved by mathematical induction for all $h$ less than or
equal to $r$ or $s$. Since


( 
T, 
where 

h), 
h), 
s), 

f 




8 
dim: = Lin; =n, 
t=1 


and if the inequality (17) holds, then it is not possible for 7 to be less than s. 
The theorem is proved. 
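COROLLARY V. If $AX = XA$, if $A - \lambda I$ has the elementary divisors $(\lambda - a_i)^{m_i}$
$(i = 1, 2, \dots, r)$, where $a_i \ne a_k$, $i \ne k$, and
$m_1 \ge m_2 \ge \cdots \ge m_r,$


and if $X - \mu I$ has the elementary divisors $(\mu - x_j)^{n_j}$ $(j = 1, 2, \dots, s)$, where
$x_j \ne x_h$, $j \ne h$, and
$n_1 \ge n_2 \ge \cdots \ge n_s,$


then

$m_i = n_i \quad (i = 1, 2, \dots, r)$

and $r = s$.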
This corollary is a direct consequence of the theorem above, for from it we 
find that 


h h 
Som; and r2s 


t=1 


and because x;* 2, that 


h A 
> m; < Don; andr ss 


t=1 t=1 


In order that these inequalities hold simultaneously, $m_i$ must equal $n_i$ and
$r$ must equal $s$.

The above theorem and corollary have obvious application to the solution
of the equation $P(A, X) = 0$ for $X$ commutative with $A$. However, matrices
$A$ and $X$ such that $A - \lambda I$ and $X - \mu I$ have the elementary divisors $(\lambda - a_i)^{m_i}$
and $(\mu - x_i)^{m_i}$ respectively are not necessarily commutative, so that even
when Corollary IV of the preceding section is satisfied it is not a simple matter
to show that the solutions, whose existence is there established, are also
commutative with $A$.

EXTENSION DIVISION,


UNIVERSITY OF WISCONSIN, 
MILWAUKEE, WIS. 




A SPECIAL INTEGRAL FUNCTION* 


BY 
R. E. A. C. PALEY 


1. Some years ago Collingwood and Valiron† proposed the problem of
whether there could exist an integral function whose minimum modulus on 
every circle |z|=r is bounded, but possessing no asymptotic paths. By an 
asymptotic path we mean a continuous path tending to infinity along which 
the value of the function tends to a limit. 

In this paper I show how to construct such a function. It is obtained by
considering the well known Weierstrassian non-differentiable function

$\sum_{n=0}^{\infty} c^n z^{a^n},$

where $c$ $(1 < c < 2)$ and $a$ (an integer) are suitably chosen. We may observe
that, if $a$ is large enough, the Weierstrassian function possesses no asymptotic
paths which tend to the boundary $|z| = 1$, while, for sufficiently small $c$,
its minimum modulus on circles $|z| = r < 1$ is bounded, and every point of
the unit circle is an essential singularity for the function.
2. Consider the function 


Fy(z) = ( { 


where $c > 1$ (say $c = 3/2$), and $a$ is large. We first show that the minimum
modulus of $F_N(z)$ on any circle $|z| = r$ is bounded, independently of $N$.
Clearly, for $r > 1$, $F_N(r)$ does not exceed


(1) exp (— r — e®?"), 


where $B$, here and in the sequel, denotes an absolute positive constant (it
may denote a different constant in different contexts). For $r \ge 1$, $N \ge 1$, the
expression (1) does not exceed a fixed constant, and thus it is sufficient to
consider $F_N(z)$ with $|z| < 1$.

Let $r$ be a fixed number less than 1. We consider the value of $F_N(re^{i\theta})$
where $\theta$ is chosen according to the following rules. We first stipulate that

* Presented to the Society, June 23, 1933; received by the editors February 4, 1933. This paper
was written by the late R. E. A. C. Paley in its present form; proof was read by J. D. Tamarkin.

† E. F. Collingwood and G. Valiron, Theorems concerning an analytic function which is bounded
upon a simple curve passing through an isolated essential singularity, Proceedings of the London
Mathematical Society, (2), vol. 26 (1927), pp. 169-184; p. 182.




$a^N\theta \equiv 0 \pmod{2\pi}$, so that also $2^Na^N\theta \equiv 0 \pmod{2\pi}$, and thus the second factor
of $F_N(z)$ is real and less than 1 in modulus. There are now $a$ possible reduced
values of $a^{N-1}\theta \pmod{2\pi}$. We choose that one which makes

$\left|\sum_{n=N-1}^{N} c^nr^{a^n}\exp(ia^n\theta)\right|$

a minimum. We now choose that one of the reduced values of $a^{N-2}\theta \pmod{2\pi}$
which makes

$\left|\sum_{n=N-2}^{N} c^nr^{a^n}\exp(ia^n\theta)\right|$

a minimum, and so on. We show that if $a$ is sufficiently large the resulting
value of

$\left|\sum_{n=0}^{N} c^nr^{a^n}\exp(ia^n\theta)\right|$

will not exceed a fixed constant independent of $N$. The argument is almost
identical with that given in an earlier paper.* We have at the first stage $a$
possible values of

(2) $\left|\sum_{n=N-1}^{N} c^nr^{a^n}\exp(ia^n\theta)\right|.$


There is one of these for which the angle between the lines joining the point
$c^Nr^{a^N}$ to the origin and to the point

$\sum_{n=N-1}^{N} c^nr^{a^n}\exp(ia^n\theta)$

is less than or equal to $\pi/a$. Then the value of (2) can be seen by elementary
geometry to lie between
and sec (x/a) — 

which, if $a$ is sufficiently large ($c = 3/2$), is certainly not greater than $c^{N-1}r^{a^{N-1}}$.
Thus

$\min \left|\sum_{n=N-1}^{N} c^nr^{a^n}\exp(ia^n\theta)\right| < c^{N-1}r^{a^{N-1}}.$
* R. E. A. C. Paley, On some problems connected with Weierstrass's non-differentiable function,
Proceedings of the London Mathematical Society, (2), vol. 31 (1930), pp. 301-328; Theorem I, pp.
304-308.




Having now fixed the reduced value of $a^{N-1}\theta \pmod{2\pi}$, we look for the
minimum value of

(3) $\left|\sum_{n=N-2}^{N} c^nr^{a^n}\exp(ia^n\theta)\right|.$

There is certainly one value for the expression (3), such that the angle
between the lines joining the point

$\sum_{n=N-1}^{N} c^nr^{a^n}\exp(ia^n\theta)$

to the origin and to the point

$\sum_{n=N-2}^{N} c^nr^{a^n}\exp(ia^n\theta)$

is less than or equal to $\pi/a$. Thus the value of (3) lies between
and > exp (iad)| sec (=) — 
n=N—1 a 


and, if $a$ is sufficiently large, it does not exceed $c^{N-2}r^{a^{N-2}}$. An inductive process
will now show that

$\min_{a^N\theta \equiv 0} \left|\sum_{n=0}^{N} c^nr^{a^n}\exp(ia^n\theta)\right| < 1,$

and we have shown that, for all values of $r$, the minimum modulus of $F_N(z)$
on $|z| = r$ is less than an absolute constant.
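The stage-by-stage choice of the reduced values of $a^n\theta$ described above is in effect a greedy procedure, and it can be tried numerically. The Python sketch below is only an illustration under stated assumptions: the function name `greedy_phase_choice` and the sample parameters ($a = 8$, $c = 3/2$, $N = 6$) are chosen here and are not taken from the paper, and only the sum $\sum c^nr^{a^n}\exp(ia^n\theta)$ is treated, not the full function $F_N$. An elementary estimate (again not from the paper) suggests that each partial sum stays within the modulus of its lowest term whenever $c \le 2\cos(\pi/a)$, so the final modulus should not exceed $r$.

```python
import cmath
import math


def greedy_phase_choice(N, a, c, r):
    """Choose the reduced values of a^n * theta (mod 2*pi), n = N, N-1, ..., 0,
    one stage at a time, keeping each partial sum as small as possible.

    Returns (theta, partial_sum), where partial_sum is the sum of
    c^n * r^(a^n) * exp(i * a^n * theta) over n = 0, ..., N.
    """
    phase = 0.0  # a^N * theta is congruent to 0 (mod 2*pi)
    partial = (c ** N) * (r ** (a ** N)) * cmath.exp(1j * phase)
    for n in range(N - 1, -1, -1):
        modulus = (c ** n) * (r ** (a ** n))
        best_phase, best_sum = None, None
        for k in range(a):  # the a admissible reduced values at this stage
            candidate = (phase + 2.0 * math.pi * k) / a
            total = partial + modulus * cmath.exp(1j * candidate)
            if best_sum is None or abs(total) < abs(best_sum):
                best_phase, best_sum = candidate, total
        phase, partial = best_phase, best_sum
    return phase, partial


# Sample run with assumed parameters a = 8, c = 3/2, N = 6.
theta, total = greedy_phase_choice(6, 8, 1.5, 0.99)
```

Each candidate phase at stage $n$ is one of the $a$ values whose $a$-fold multiple reproduces the phase already fixed at stage $n + 1$, so the earlier choices are never disturbed.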
The derivative $F_N'(z)$ of $F_N(z)$ is


Now suppose that $a$ is sufficiently large, and that $1 - 3a^{-N} < r < 1 - 2a^{-N}$. Then
the expression (4) is majorized* by the single term


(5) 


Indeed the single term (5) exceeds in modulus 


1 
— = a2 6, 


* See, e.g., G. H. Hardy, Weierstrass’ non-differentiable function, these Transactions, vol. 17 
(1916), pp. 301-332. 




while the difference between the terms (4) and (5) is not greater in modulus 


than 
NN 


NN 
1 — N-1 N 


2a ad 


1—2a-" 
ob over exp ( ) i} <= 10-*a%c¥ 


1-—a% 


if a and N are sufficiently large. 
3. We now write 


a 
= ue = (=) 
k=i k 

and set, for abbreviation, 

Si(u) = Fy,(u), a, Bx 
where Ai: =0, while Ni, (R=1, 2,---+) remain to be 
chosen. We write $N_1 = R_1 = 1$ and give an inductive method for choosing $N_k$, $R_k$
for $k > 1$. Suppose that we have already chosen $N_1, \dots, N_{k-1}$, $R_1, \dots, R_{k-1}$.


Since first $F_N(z) = O(|z|)$ for small $z$ uniformly in $N$, we may choose $R_k$ so
large that


(6) | | S 2-*,|2| S Ren, = 1,2,---,k—1, 


whatever the value of $N_k$ may be. Next since, for $|z| < R/2$ we have,
uniformly in $N_k$ and $R$,
(=)" ] 0(| | 
= ak—1 R—a, 
dz’ L\R 


we may also assume that $R_k$ is so large that, whatever the value of $N_k$ may
be, we have


d 
(7) |= < 2-10, |z| < 


This finally fixes $R_k$. We now choose $N_k$ so great that


d k—-1 
(8) > 108 max |— fm(um)|. 
|s| sR dz mal 


We next observe that, for — (2a)-"* 7/4 <0@< (2a)-** 2/4, r2=1, 




ge, Ne 


r 
| fx(z) | * exp 


i-a™ 


= exp {- Q-112 (. =) 
1 — 


For —a the argument of u; is and thus in modulus 
does not exceed 


whence, on the range |z|>Ri, <0 <a> «7, 
max | fi(ux)| < exp { — (Be)?"*} 
We may thus increase $N_k$ if necessary so as to ensure that, on the same range,


(9) max | fx(ux)| 2-*. 


This finally fixes $N_k$.
4. We observe first that, in virtue of (6), $F(z)$ is in fact an integral function.
Next (6) and (9) give us


(10) max { < B, 


lel l=k+1 
Ri | S Rigs, — SOS 


Also Fy,,, is so constructed that for fixed r, satisfying Ri 
(11) min | < B, 
— amr SOS 
The equations (10) and (11) show that if $r$ is fixed with $R_k \le r \le R_{k+1}$, then

$\min_\theta |F(re^{i\theta})| < B,$

where $B$ is independent of $r$, $k$. Thus, the minimum modulus of $F(z)$ on circles
$|z| = r$ is bounded.

5. We have now to show that F(z) has no asymptotic path. To do this we 
show that in certain regions the differential coefficient of F(z) is not only large 
but so large that there can be no continuous path passing through all these 
regions on which F(z) is bounded. 

Consider F’(z) in the annulus 


(12) 1 — 3a-%* S S 1 — 




We have 

—fi(ux) = = —u,(ac)%e (=) (1+ 6), 
dz du, Ri 
where |¢|< |10-*, in virtue of the remarks at the end of §2. Now, in the 
annulus considered, when a>6, 


ak Zz Z 
— = — 
Ri Ri 


Also, by (7) and (8), in the annulus considered, 


k-1 


m=1 m=k+1 
and thus 
(13) F'(z) = + 
where |e’ |<3-10-*<10-*x-1, 
Now let ¢ and ¢’ be two points of the annulus (12) and let 
|= | = 10-187. 


Then (13) shows that 


(14) f BycNe Rj + 


4\ Bk 


3 N —1 


< 10-*| — ¢| 10-*c%, 


where 


for we may certainly find a path joining ¢ and ¢’ of length not exceeding 
|¢’—¢|z, entirely interior to the annulus (12). The first term of (14) is 


(16) — (1 + €”), 


where 


+R, 




| <”| 10{(1 + 10-4871) — 1 — 10-1} < 1/10. 


Since, finally, in virtue of (12), if a is large enough, 
it follows from (14), (15), (16) that 
(17) | F(¢’) — F(g) | 2-10-4c%. 
Since, for sufficiently large a, the breadth of the strip (12) exceeds 
10*'R, 87", 
(17) shows that there can be no continuous path, crossing the strip, for which 


the minimum modulus of F(z) is less than 10-‘c%* which is arbitrarily large 
with k. Thus there can be no asymptotic path tending to infinity. 


MASSACHUSETTS INSTITUTE OF TECHNOLOGY, 
CAMBRIDGE, MASS.




PARAMETRIZATIONS OF SADDLE SURFACES,
WITH APPLICATION TO THE PROBLEM
OF PLATEAU†


BY
E. J. McSHANE‡


Introduction. In the study of the properties of rectifiable curves $x = x(t)$,
$y = y(t)$, $z = z(t)$, and of integrals of the calculus of variations taken along
such curves, the investigator is greatly aided by the fact that the absolute
continuity of $x(t)$, $y(t)$ and $z(t)$ is known to be necessary and sufficient in
order that the length of the curve be equal to the classical integral
$\int [x'^2 + y'^2 + z'^2]^{1/2}\,dt$, and also by the existence of a parametric representation of
the curve (in terms of length of arc) in which the defining functions are
Lipschitzian. On the other hand, let us suppose that the continuous surface
$S$, represented by the equations $x = x(u, v)$, $y = y(u, v)$, $z = z(u, v)$, has finite
area in the sense of Lebesgue. We know no conditions necessary and sufficient
to insure that the area of $S$ be equal to $\iint (EG - F^2)^{1/2}\,du\,dv$, nor can we in
general find a parametric representation of $S$ which enjoys any particularly
desirable properties. However, in a previous paper§ I have found certain
conditions on the functions $x(u, v)$, $y(u, v)$, $z(u, v)$ which are sufficient to insure
that the area be given by the classical double integral; and I have shown
that on the class of all surfaces satisfying these conditions the double
integrals of the kind usually considered in the calculus of variations have the
property of semi-continuity.

It is therefore desirable to show that large classes of surfaces can be given 
representations satisfying the above mentioned conditions. Certainly it is not 
true that all continuous surfaces can be given such representations. However, 
let us restrict our attention to the class of surfaces for which the defining 
functions $x(u, v)$, etc., are monotonic in the sense of Lebesgue (including in
particular the important class of saddle surfaces||). In the present paper it 
is shown that for every such surface a representation can be found which 
satisfies the conditions mentioned, and is in fact almost as advantageous as 
a Lipschitzian representation would be. 


† Presented to the Society, October 29, 1932; received by the editors July 19, 1932, and, after
revision, December 20, 1932.

‡ National Research Fellow.

§ Integrals over surfaces in parametric form, Annals of Mathematics, vol. 34. 

|| The definition of the term “saddle surface” is given in §1. 




One might readily suspect that such a representation would not be devoid 
of utility. As a matter of fact, a first use presents itself almost immediately; 
for with little additional effort we arrive at a solution† of the problem of Plateau,
a solution not without interest when viewed as an application of general
theorems proper to the direct method of the calculus of variations. To
those readers who are interested principally in the problem of Plateau and 
only secondarily in the general theorems, I would like to point out that the 
distinctive feature of the present method is not its elegance (in which respect 
it is inferior to its predecessors) but the directness of the line of thought. 
First a solution of the problem of least area is found; then this solution is 
shown to admit of a representation which is in a sense almost everywhere 
conformal, so that the surface has to be minimal. 

Finally, I would like to remark that whatever familiarity I may have with 
this branch of mathematics is due in large part to my conversations with 
Professor Radó.

1. Monotonic functions and saddle surfaces. Let the function $f(u, v)$ be
defined and continuous on a point set $E$ consisting of an open set plus its
boundary. We define‡ the monotonic deficiency of $f$ (as Lebesgue did) in the
following manner. Let $R$ be any open set in $E$ and $R^*$ its boundary, and
denote the maximum and minimum of $f$ on $R + R^*$ by $L$, $l$ respectively, and
the maximum and minimum of $f$ on $R^*$ by $L_1$, $l_1$ respectively. Then the least
upper bound of the quantities $L - L_1$ and $l_1 - l$ as $R$ varies over all open
sets contained in $E$ will be called the monotonic deficiency of $f$. Clearly this
quantity is $\ge 0$.

The function f will be called monotonic if its monotonic deficiency is zero. 
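The definition can be made concrete on a grid. The following Python sketch is an assumption-laden illustration, not part of the paper: the function name `deficiency_lower_bound` is hypothetical, the set $E$ is taken to be the square $[-1, 1]^2$, only axis-parallel rectangles are tried as the open sets $R$, and $f$ is sampled at grid points. It therefore yields only a lower bound for the monotonic deficiency.

```python
def deficiency_lower_bound(f, n=8):
    """Lower bound for the monotonic deficiency of f on the square [-1, 1]^2.

    Only axis-parallel rectangles with vertices on an (n+1) x (n+1) grid are
    used as the open sets R, so the true deficiency may be larger.
    """
    pts = [-1.0 + 2.0 * i / n for i in range(n + 1)]
    vals = [[f(u, v) for v in pts] for u in pts]
    worst = 0.0
    for i0 in range(n + 1):
        for i1 in range(i0 + 2, n + 1):          # at least one interior row
            for j0 in range(n + 1):
                for j1 in range(j0 + 2, n + 1):  # at least one interior column
                    closed, boundary = [], []
                    for i in range(i0, i1 + 1):
                        for j in range(j0, j1 + 1):
                            closed.append(vals[i][j])
                            if i in (i0, i1) or j in (j0, j1):
                                boundary.append(vals[i][j])
                    big_l, small_l = max(closed), min(closed)      # L and l
                    big_l1, small_l1 = max(boundary), min(boundary)  # L_1, l_1
                    worst = max(worst, big_l - big_l1, small_l1 - small_l)
    return worst


saddle = deficiency_lower_bound(lambda u, v: u * u - v * v)  # saddle-shaped
bowl = deficiency_lower_bound(lambda u, v: u * u + v * v)    # interior minimum
```

For the saddle-shaped sample the bound is zero, in keeping with monotonicity, while the bowl has an interior minimum below every boundary value and so shows a positive deficiency.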

Suppose now that $S$ is a continuous surface§ represented by the equations


$x = x(u, v), \quad y = y(u, v), \quad z = z(u, v), \qquad (u, v) \text{ on } B,$


where $B$ is a region consisting of a Jordan curve and its interior. The surface
$S$ is said to be a saddle surface provided that for every triple of constants $a$,
$b$, $c$ the linear combination $ax(u, v) + by(u, v) + cz(u, v)$ is a monotonic function.
If $S$ has everywhere a well-defined total curvature, this is equivalent
to requiring that the total curvature be non-positive; if $x = u$ and $y = v$, so
that $z = z(x, y)$, this is equivalent to Radó's definition.


† Previous solutions of this problem have been obtained by J. Douglas and by T. Radó. The
solutions by these authors of the problem of Plateau in its usual form are summed up in the following
papers:

J. Douglas, Solution of the problem of Plateau, these Transactions, vol. 33 (1931), p. 263;

T. Radó, The problem of the least area and the problem of Plateau, Mathematische Zeitschrift,
vol. 32 (1930), p. 763.

‡ H. Lebesgue, Sur le problème de Dirichlet, Rendiconti del Circolo Matematico di Palermo,
vol. 24 (1907), p. 385.

§ We assume that the reader is familiar with the definition of distance of continuous surfaces
and of Lebesgue area, as presented, e.g., by Radó (loc. cit., pp. 772-774) or McShane (Annals of
Mathematics, vol. 33, pp. 461-463).

It is however our duty to show that the property of being a saddle surface 
is a property of the surface S, and does not depend on the particular represen- 
tation of S. To do this we show that for every surface S it is true that for 
every pair of representations 


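(1.1) $x = x(u, v), \quad y = y(u, v), \quad z = z(u, v), \qquad (u, v) \text{ on } B,$
(1.2) $x = \bar{x}(\bar{u}, \bar{v}), \quad y = \bar{y}(\bar{u}, \bar{v}), \quad z = \bar{z}(\bar{u}, \bar{v}), \qquad (\bar{u}, \bar{v}) \text{ on } \bar{B},$


of the same surface $S$, and for every triple of constants $a$, $b$, $c$, the monotonic
deficiency of $ax + by + cz$ is equal to the monotonic deficiency of $a\bar{x} + b\bar{y} + c\bar{z}$.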
The proof of this statement offers no difficulty, and from it our assertion con- 
cerning saddle surfaces follows immediately. 

However, the class of surfaces for which we shall obtain a special representation
is larger than the class of saddle surfaces, and consists in fact of those
surfaces for which the defining functions $x(u, v)$, $y(u, v)$, $z(u, v)$ are each monotonic.
The point particularly to be observed is that this property is independent
of the particular representation of the surface.

2. Lemma on convergence. The functions with which we shall be particu- 
larly concerned are those satisfying the following conditions. 


(2.1a) The function $f(u, v)$ is defined and continuous on the unit circle $C$:
$u^2 + v^2 \le 1$.

(2.1b) $f(u, v)$ is absolutely continuous in $u$ for almost all fixed values of $v$,
and absolutely continuous in $v$ for almost all fixed values of $u$.


(2.1c) The integral


$I(f) = \iint_C \left[(\partial f/\partial u)^2 + (\partial f/\partial v)^2\right] du\,dv$


exists.‡
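For a smooth sample function the integral in (2.1c) is easy to approximate. The Python sketch below is offered only as an illustration: the name `dirichlet_integral`, the grid size and the difference step are assumptions made here. It uses a midpoint rule over the cells whose centres lie in the unit circle and central differences for the partial derivatives. For $f = u$ the exact value is $\pi$, and for $f = u^2 - v^2$ it is $2\pi$.

```python
def dirichlet_integral(f, n=200, h=1e-5):
    """Approximate I(f), the integral of f_u^2 + f_v^2 over the unit circle,
    by a midpoint rule on an n x n grid covering [-1, 1]^2."""
    cell = 2.0 / n
    total = 0.0
    for i in range(n):
        u = -1.0 + (i + 0.5) * cell
        for j in range(n):
            v = -1.0 + (j + 0.5) * cell
            if u * u + v * v <= 1.0:  # keep cells whose centre is in C
                fu = (f(u + h, v) - f(u - h, v)) / (2.0 * h)
                fv = (f(u, v + h) - f(u, v - h)) / (2.0 * h)
                total += (fu * fu + fv * fv) * cell * cell
    return total


linear = dirichlet_integral(lambda u, v: u)              # exact value: pi
saddle = dirichlet_integral(lambda u, v: u * u - v * v)  # exact value: 2*pi
```

The sketch presupposes differentiable integrands; it says nothing about the merely absolutely continuous functions admitted by (2.1b), for which the integral is a Lebesgue integral.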


We can now state an extension of a theorem of Lebesgue§ which is to 
play an important role in the following pages. 


† See, e.g., Radó, Geometrische Betrachtungen über zweidimensionale reguläre Variationsprobleme,
Acta Szeged, vol. 2 (1926), pp. 228-253, especially pp. 229-230.

‡ This and all other integrals are understood to be Lebesgue integrals.

§ H. Lebesgue, loc. cit., p. 386. 




Lemma 1. Let the functions $f_1(u, v), f_2(u, v), \dots$ be defined and satisfy
conditions (2.1) on the unit circle $C$. Suppose further that the sequence $\{f_n(u, v)\}$
is uniformly convergent on the circumference of $C$, and that the monotonic
deficiency of $f_n(u, v)$ tends to zero with $1/n$, and also that $I(f_n) \le H$ for every $n$,
where $H$ is some constant. Then there exists a subsequence of the $\{f_n(u, v)\}$
converging uniformly on the whole circle $C$ to a monotonic limit function $f(u, v)$,
and $f(u, v)$ also satisfies conditions (2.1).


To prove the existence of a uniformly convergent subsequence of $\{f_n\}$
we can follow the proof of Lebesgue† almost without change. The fact that
the $f_n(u, v)$ are not all identical on the boundary, but merely converge
uniformly, causes no trouble. But we find it convenient to replace the circles
$C(r)$ used by Lebesgue in his proof by squares $Q(r)$; this device will be used
in the proof of Lemma 2, and we therefore do not give it in detail here.

It remains to prove that the limit function $f(u, v)$ satisfies conditions
(2.1). To do this we use a slight modification of a theorem of Fatou‡: If the
functions $\phi_1(x), \phi_2(x), \dots$ are all non-negative and their integrals

$\int_a^b \phi_n(x)\,dx$

are all less than a fixed number, then $\liminf \phi_n(x)$ is summable, and

$\int_a^b \liminf \phi_n(x)\,dx \le \liminf \int_a^b \phi_n(x)\,dx.$

The proof differs only very slightly from that given by Fatou. Returning to
our uniformly convergent subsequence of the functions $f_n(u, v)$ (for which
subsequence we retain the notation $\{f_n(u, v)\}$), we find by virtue of this
theorem that§


$\int_{-1}^{1} du \cdot \liminf \int dv\,(\partial f_n/\partial v)^2 \le \liminf \int_{-1}^{1} du \cdot \int dv\,(\partial f_n/\partial v)^2 \le H.$
anf -1 


Hence the set of values of $u$ for which the expression

(2.2) $\liminf \int (\partial f_n/\partial v)^2\,dv$

is infinite has measure zero. Likewise, since $(\partial f_n/\partial v)^2$ is summable over the
unit circle, the integral


(2.3) $\int (\partial f_n/\partial v)^2\,dv$


is finite for almost all values of $u$. We define $E$ to be the set of values of $u$ for
which one or more of the expressions (2.2), (2.3) is infinite; this set has measure
zero, and we shall henceforward restrict our attention to values of $u$
belonging to the complement of $E$.


† Loc. cit.

‡ P. Fatou, Séries trigonométriques et séries de Taylor, Acta Mathematica, vol. 30 (1906), p. 375.

§ All single integrals in this proof are understood to have the limits $-(1 - u^2)^{1/2}$, $+(1 - u^2)^{1/2}$,
unless other ranges are specifically indicated.

We now fix upon any value of $u$ belonging to the complement of $E$, and
select a subsequence $f_\alpha(u, v)$ of our original convergent subsequence ($\alpha$
ranges over a subset of the positive integers) for which there exists the limit


$\lim \int (\partial f_\alpha(u, v)/\partial v)^2\,dv = \liminf \int (\partial f_n(u, v)/\partial v)^2\,dv.$


By the proof of a theorem of F. Riesz† it is possible to select a subsequence of
the sequence $\{f_\alpha\}$ which converges everywhere to a function $\phi(v)$ which is
absolutely continuous and whose derivative $d\phi/dv$ is summable together
with its square, and in addition


$\int (d\phi/dv)^2\,dv \le \lim \int (\partial f_\alpha/\partial v)^2\,dv.$


But the sequence $\{f_\alpha\}$ converges uniformly to $f(u, v)$; hence $\phi(v) = f(u, v)$,
so that $f(u, v)$ is absolutely continuous in $v$ and


$\int (\partial f/\partial v)^2\,dv \le \liminf \int (\partial f_n/\partial v)^2\,dv.$
Applying Fatou's theorem once again, we obtain


(2.4) $\int_{CE} du \int dv\,(\partial f/\partial v)^2 \le \int_{CE} du \cdot \liminf \int dv\,(\partial f_n/\partial v)^2 \le \liminf \iint_C (\partial f_n/\partial v)^2\,du\,dv \le H.$


Likewise $f(u, v)$ is absolutely continuous in $u$ for almost all fixed values of $v$,
and


(2.5) $\iint_C (\partial f/\partial u)^2\,du\,dv \le \liminf \iint_C (\partial f_n/\partial u)^2\,du\,dv.$


† F. Riesz, Untersuchungen über Systeme integrierbarer Funktionen, Mathematische Annalen,
vol. 69 (1910), pp. 466-468.


Adding inequalities (2.4) and (2.5), we† find that $I(f)$ exists and moreover
(2.6) $I(f) \le \liminf I(f_n)$.

3. Convergence on the boundary. For the sake of compactness, the boundary
curve

$x = x(\cos\theta, \sin\theta), \quad y = y(\cos\theta, \sin\theta), \quad z = z(\cos\theta, \sin\theta)$
of the surface
$x = x(u, v), \quad y = y(u, v), \quad z = z(u, v), \qquad u^2 + v^2 \le 1,$
will be written in the shorter form
$x = x(\theta), \quad y = y(\theta), \quad z = z(\theta).$


The symbols $E$, $F$, $G$ will as usual have the respective meanings $x_u^2 + y_u^2 + z_u^2$,
$x_ux_v + y_uy_v + z_uz_v$, $x_v^2 + y_v^2 + z_v^2$.


Lemma 2. Let $\{S_n\}$ be a sequence of surfaces possessing representations


(3.1) $x = x_n(u, v), \quad y = y_n(u, v), \quad z = z_n(u, v), \qquad u^2 + v^2 \le 1,$


in which the functions $x_n$, $y_n$, $z_n$ satisfy conditions (2.1), and for which


$\iint_C (E_n + G_n)\,du\,dv \le H$


for all values of $n$, $H$ being a fixed number. Suppose further that the boundary
curves
(3.2) $\Gamma_n$: $x = x_n(\theta), \quad y = y_n(\theta), \quad z = z_n(\theta)$
of the surfaces $S_n$ approach as a limit a simple closed curve $\Gamma$, and that for three
distinct values $\theta_1$, $\theta_2$, $\theta_3$ of $\theta$ the sequences of points $\{x_n(\theta_i), y_n(\theta_i), z_n(\theta_i)\}$
approach three distinct limit points $(\xi_i, \eta_i, \zeta_i)$ $(i = 1, 2, 3)$. Then it is possible to
find a representation
(3.3) $x = x(\theta), \quad y = y(\theta), \quad z = z(\theta)$
of the curve $\Gamma$ and a subsequence of the $\{S_n\}$ (for which we retain the same
notation) such that

$\lim x_n(\theta) = x(\theta), \quad \lim y_n(\theta) = y(\theta), \quad \lim z_n(\theta) = z(\theta)$
uniformly in $\theta$.


† My original proof of this lemma was decidedly more intricate than the above; for the present
proof I wish here to thank Professor Tamarkin.




Let $\{\epsilon_n\}$ be a sequence of positive numbers tending to zero such that
the distance $\|\Gamma_n, \Gamma\|$ between $\Gamma_n$ and $\Gamma$ is less than $\epsilon_n$ for every $n$. If we fix
upon any topological representation $x = x(\tau)$, $y = y(\tau)$, $z = z(\tau)$ of the curve $\Gamma$
on the unit circumference (the functions $x$, $y$, $z$ having period $2\pi$), then we
can find a topological mapping of the unit circumference on itself, expressible
by the equation $\tau = \tau_n(\theta)$, $0 \le \theta \le 2\pi$, in which $\tau_n(\theta)$ is a continuous monotonic
function such that $0 \le \tau_n(0) = \tau_n(2\pi) - 2\pi \le 2\pi$, for which


(3.4) $|x_n(\theta) - x(\tau_n(\theta))| < \epsilon_n$


with like inequalities for $y$ and $z$. By Helly's theorem there exists a subsequence
of the sequence $\{\tau_n(\theta)\}$ which converges for each $\theta$ in the interval
$(0, 2\pi)$ to a monotonic limit function $\tau(\theta)$. We reject all $\tau_n$ and their
corresponding $S_n$ which do not belong to this subsequence, and we re-name the
remaining subsequence $\{\tau_n\}$, and the remaining surfaces $\{S_n\}$.

We thus find that for every $\theta$ we have $\lim \tau_n(\theta) = \tau(\theta)$; and since $x(\tau)$,
$y(\tau)$, $z(\tau)$ are continuous, this implies $\lim x(\tau_n(\theta)) = x(\tau(\theta))$ for every $\theta$. From
this and inequality (3.4) it follows that for every $\theta$ we have


(3.5) $\lim x_n(\theta) = x(\tau(\theta));$


similar equations hold for $y$ and $z$. In particular, for the values $\theta_1$, $\theta_2$, $\theta_3$ of
the hypothesis we have $\lim x_n(\theta_i) = \xi_i$, whence $x(\tau(\theta_i)) = \xi_i$; likewise $y(\tau(\theta_i)) = \eta_i$,
$z(\tau(\theta_i)) = \zeta_i$. Since the three points $(\xi_i, \eta_i, \zeta_i)$ are distinct, the three
numbers $\tau(\theta_i)$ must also be distinct.

We now prove that the function $\tau(\theta)$ is continuous. Suppose on the contrary
that there is a point† $\theta_0$ of discontinuity of $\tau(\theta)$; then $\tau(\theta_0 - 0)$ and
$\tau(\theta_0 + 0)$ both exist, and $\tau(\theta_0 + 0) > \tau(\theta_0 - 0)$. We cannot have
$\tau(\theta_0 + 0) = \tau(\theta_0 - 0) + 2\pi$; for this would imply that $\tau(\theta)$ is constantly equal to
$\tau(\theta_0 - 0)$ on the interval $(0, \theta_0)$, and constantly equal to $\tau(\theta_0 + 0)$ on the
interval $(\theta_0, 2\pi)$, as readily follows from the fact that $\tau(\theta)$ is monotonic and
$\tau(2\pi) = \tau(0) + 2\pi$. This is in contradiction with the previously established
fact that $\tau(\theta_1)$, $\tau(\theta_2)$, and $\tau(\theta_3)$ are all distinct. Hence
$0 < \tau(\theta_0 + 0) - \tau(\theta_0 - 0) < 2\pi$; and since the equations $x = x(\tau)$, etc., map the unit
circumference topologically on $\Gamma$, the points


$(x(\tau(\theta_0 - 0)),\ y(\tau(\theta_0 - 0)),\ z(\tau(\theta_0 - 0)))$

and

$(x(\tau(\theta_0 + 0)),\ y(\tau(\theta_0 + 0)),\ z(\tau(\theta_0 + 0)))$


are distinct. Suppose to be specific that


† The following proof is constructed for the case in which $\theta_0$ is an interior point of the interval
$(0, 2\pi)$. In case $\theta_0$ is an end point of the interval, say $\theta_0 = 0$, we have only to remember that $\tau(\theta)$ and
all the $\tau_n(\theta)$ are periodic, and consider them as defined on the interval $(-\pi, \pi)$.


(3.6) + 0)) = — 0)) + > 0. 
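Since $x(\tau)$ is continuous, there exists a positive number $\delta$ $(< \pi/3)$ such that
(3.7) $|x(\tau(\theta)) - x(\tau(\theta_0 - 0))| < \epsilon \qquad (\theta_0 - \delta < \theta < \theta_0)$
and
(3.8) $|x(\tau(\theta)) - x(\tau(\theta_0 + 0))| < \epsilon \qquad (\theta_0 < \theta < \theta_0 + \delta).$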
We now introduce a new coordinate system convenient for our present
purposes. Let $Q(r)$ be the square (the line-configuration, not the region)
with center at the point $(\cos\theta_0, \sin\theta_0)$ and sides of length $2r$ parallel to the
coordinate axes. If $4r < \delta$, the square $Q(r)$ intersects the circumference once in
the interval $(\theta_0 - \delta, \theta_0)$ and once in the interval $(\theta_0, \theta_0 + \delta)$. To each point
$(u, v)$ on $Q(r)$ we assign the coordinates $(r, s)$, where $s$ is the length of arc of
$Q(r)$ from the point at which $Q(r)$ enters the circle $C$ to the point $(u, v)$,
measured counter-clockwise. Then $(\partial x_n(r, s)/\partial s)^2$ is equal to $(\partial x_n(u, v)/\partial u)^2$
or to $(\partial x_n(u, v)/\partial v)^2$, according to the side of $Q(r)$ on which $(u, v)$ lies;
and therefore denoting by $s(r)$ the $s$-coordinate of the point at which $Q(r)$
leaves the circle, we have


(3.9) $\int_0^{\delta/4}\int_0^{s(r)} (\partial x_n(r, s)/\partial s)^2\,ds\,dr \le \iint_C \left[(\partial x_n/\partial u)^2 + (\partial x_n/\partial v)^2\right] du\,dv \le H.$


Hence for almost all values of $r < \delta/4$ all the integrals

(3.10) $\int_0^{s(r)} (\partial x_n/\partial s)^2\,ds$


are finite; and by the theorem of Fatou†


(3.11) $\int_0^{\delta/4} \liminf \int_0^{s(r)} (\partial x_n/\partial s)^2\,ds\,dr \le \liminf \int_0^{\delta/4}\int_0^{s(r)} (\partial x_n/\partial s)^2\,ds\,dr \le H$


so that for almost all values of $r < \delta/4$ the expression

(3.12) $\liminf \int_0^{s(r)} (\partial x_n/\partial s)^2\,ds$

is finite. We define $E$ to be the set of values of $r$ for which one or more of the
expressions (3.10) or (3.12) is infinite; then $E$ has measure zero. We henceforth
consider only values of $r$ which belong to the complement of $E$, and
which satisfy the additional condition that all the functions $x_n(r, s)$ are
absolutely continuous in $s$; the set of values of $r$ thus rejected has measure zero.


† I.e., as stated in the proof of Lemma 1.
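To the integrals (3.10) we now apply the inequality of Schwarz, thus
finding


(3.13) $\int_0^{s(r)} (\partial x_n/\partial s)^2\,ds \ge \left\{\int_0^{s(r)} (\partial x_n/\partial s)\,ds\right\}^2 \cdot \frac{1}{s(r)} = [x_n(r, s(r)) - x_n(r, 0)]^2/s(r).$


Inequalities (3.13) and (3.11) together imply that


(3.14) $\int_0^{\delta/4} \liminf\,[x_n(r, s(r)) - x_n(r, 0)]^2 \cdot \frac{1}{s(r)}\,dr \le H.$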


But the points $(r, 0)$ and $(r, s(r))$ are both on the circumference of the circle
$C$, and lie each in one of the intervals $(\theta_0 - \delta, \theta_0)$ and $(\theta_0, \theta_0 + \delta)$; hence by
(3.6), (3.7), and (3.8) we have


(3.15) lim inf [x,(r, s(r)) — xn(r, 0) |? = 


5/4 
(3.14) f lim inf [«,(r, s(r)) — xa(r, 0) ]?- 


Also by its definition s(r) <8r. Hence 


(3.16) lim inf [«,(r, s(r)) — x,(r, 0) ]?- > 

s(r) 8r 
which is not summable over the interval $0 < r < \delta/4$. This contradicts
inequality (3.14), and therefore the assumption that $\tau(\theta)$ is discontinuous leads
to a contradiction.
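The two ingredients of this contradiction can be illustrated numerically. The Python sketch below is an assumption-based illustration and not part of the paper: the names `energy`, `schwarz_lower_bound` and `log_tail`, the sample function, and the constant `k` (standing for whatever fixed positive numerator appears in the lower bound of the form $k/(8r)$) are all hypothetical. The first two functions check the Schwarz inequality used for (3.13) on difference quotients; the third shows that the integral of $k/(8r)$ from $\rho$ to $\delta/4$ grows without bound as $\rho$ tends to zero.

```python
import math


def energy(x, length, n=2000):
    """Approximate the integral of (dx/ds)^2 over [0, length] using
    difference quotients on n equal cells."""
    h = length / n
    total = 0.0
    for k in range(n):
        slope = (x((k + 1) * h) - x(k * h)) / h
        total += slope * slope * h
    return total


def schwarz_lower_bound(x, length):
    """Right-hand side of the Schwarz estimate: [x(S) - x(0)]^2 / S."""
    return (x(length) - x(0.0)) ** 2 / length


def log_tail(k, delta, rho):
    """Closed form of the integral of k / (8 r) over rho <= r <= delta / 4."""
    return (k / 8.0) * math.log(delta / (4.0 * rho))


def sample(s):
    return math.sin(3.0 * s) + s  # an arbitrary smooth sample function


lhs = energy(sample, 2.0)
rhs = schwarz_lower_bound(sample, 2.0)
tails = [log_tail(1.0, 0.4, rho) for rho in (1e-3, 1e-6, 1e-12)]
```

Equality in the Schwarz estimate holds for linear functions, which is why a jump of fixed size across an arc of length $s(r) < 8r$ forces the integrand to be at least of order $1/r$.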

Therefore the sequence $\{\tau_n(\theta)\}$ of continuous monotonic functions
converges everywhere to the continuous function $\tau(\theta)$. It follows† that the
convergence of $\tau_n(\theta)$ to $\tau(\theta)$ is uniform; hence $\lim x(\tau_n(\theta)) = x(\tau(\theta))$
uniformly. This with (3.4) implies that $\lim x_n(\theta) = x(\tau(\theta))$ uniformly in $\theta$;
similar statements hold for $y$, $z$. Hence the curve


(3.17) $x = x(\tau(\theta)), \quad y = y(\tau(\theta)), \quad z = z(\tau(\theta))$


is a limit curve of the sequence $\{\Gamma_n\}$. But $\{\Gamma_n\}$ has the unique limit $\Gamma$;
therefore equations (3.17) form a representation of $\Gamma$, and the lemma is
established.

4. A theorem on representations. We now proceed to prove our principal 
theorem concerning parametric representations. 


† Buchanan, H. E., and Hildebrandt, T. H., Note on the convergence of a sequence of functions
of a certain type, Annals of Mathematics, vol. 9 (1908), p. 123. 


1933] PARAMETRIZATIONS OF SADDLE SURFACES 725 


THEOREM I. If the continuous surface S, represented by the equations

$x = \bar x(u, v),\ y = \bar y(u, v),\ z = \bar z(u, v),\ u^2 + v^2 \le 1,$

satisfies the conditions
(4.1) the Lebesgue area L(S) of the surface S is finite;
(4.2) the curve

$x = \bar x(\theta),\ y = \bar y(\theta),\ z = \bar z(\theta)$

bounding S is a Jordan curve;
(4.3) the functions $\bar x(u, v)$, $\bar y(u, v)$, $\bar z(u, v)$ are monotonic;
then there exists a representation

$x = x(u, v),\ y = y(u, v),\ z = z(u, v),\ u^2 + v^2 \le 1,$

of S in which the functions x(u, v), etc., satisfy the conditions (2.1), and also
satisfy the relations

(4.4)  E = G, F = 0 for almost all values of (u, v).

Moreover under any change of parameters $u = u(\bar u, \bar v)$, $v = v(\bar u, \bar v)$ representing a
conformal mapping of the unit circle on itself the three functions $x(u(\bar u, \bar v), v(\bar u, \bar v))$,
etc., also satisfy conditions (2.1) and (4.4).

Before proceeding to the proof of this theorem, we observe that the hy- 
potheses are independent of the representation of S. 


Now let $\{\Pi_n\}$ be a sequence of polyhedra tending to S for which the areas
$L(\Pi_n)$ tend to L(S); we can assume without loss of generality that none of
the triangles which form the faces of $\Pi_n$ are degenerate. If $\Pi_n$ has the
representation

$x = \bar x_n(\alpha, \beta),\ y = \bar y_n(\alpha, \beta),\ z = \bar z_n(\alpha, \beta),\ \alpha^2 + \beta^2 \le 1,$

then there exists for each n a topological mapping $\alpha = \alpha_n(u, v)$, $\beta = \beta_n(u, v)$ of
the unit circle on itself such that $\lim \bar x_n(\alpha_n(u, v), \beta_n(u, v)) = \bar x(u, v)$ uniformly;
and since $\bar x(u, v)$ is monotonic, this implies† that the monotonic deficiency
of $\bar x_n(\alpha_n(u, v), \beta_n(u, v))$ tends to zero with 1/n; similar statements hold for
y and for z. But the functions $\bar x_n(\alpha_n(u, v), \beta_n(u, v))$, etc., form a representation
of $\Pi_n$; hence by §1 the monotonic deficiency of $\bar x_n(\alpha, \beta)$ is equal to the
monotonic deficiency of $\bar x_n(\alpha_n(u, v), \beta_n(u, v))$, and consequently tends to
zero with 1/n.

On the circumference of the unit circle we now choose three distinct points
$A_1^*, A_2^*, A_3^*$, and on the curve $\Gamma$ we choose three distinct points $A_1, A_2, A_3$.


† Lebesgue, loc. cit., p. 385.




Since $\lim \Pi_n = S$, the boundary curves $\Gamma_n$ of the polyhedra $\Pi_n$ tend to the
curve $\Gamma$; hence on each $\Gamma_n$ we can choose three distinct points $A_1^{(n)}, A_2^{(n)}, A_3^{(n)}$
such that $A_i^{(n)}$ approaches $A_i$ as n increases.

From general theorems on the conformal maps of abstract Riemann surfaces†
it follows that any polyhedron $\Pi$ whose faces are non-degenerate admits
of a parametric representation of the following kind.

(a) The functions representing $\Pi$ are defined on the unit circle; i.e., $\Pi$ is
given by the equations

(4.5)  $x = x(u, v),\ y = y(u, v),\ z = z(u, v),\ u^2 + v^2 \le 1.$

(b) The unit circle is subdivided by arcs into a finite number of curvilinear
triangles $\delta_1, \dots, \delta_k$ and the equations (4.5) carry each $\delta_i$ in a topological
way into a rectilinear triangle in xyz-space.

(c) The triangles $\delta_i$ are bounded by arcs which are analytic, including end
points.

(d) Interior to each triangle $\delta_i$ the functions x(u, v), y(u, v), z(u, v) are
analytic and satisfy the relations

(4.6)  E = G, F = 0.

(e) Three arbitrarily given distinct points $A_1, A_2, A_3$ on the boundary curve
of $\Pi$ correspond under equations (4.5) to three arbitrarily given distinct
points $A_1^*, A_2^*, A_3^*$ on the unit circle $u^2 + v^2 = 1$.

For such a representation of $\Pi$ we find without difficulty that conditions
(2.1) are satisfied. We now choose for each $\Pi_n$ a representation

(4.7)  $\Pi_n$: $x = x_n(u, v),\ y = y_n(u, v),\ z = z_n(u, v),\ u^2 + v^2 \le 1,$

satisfying the above conditions; in particular, for the points $A_1^*, A_2^*, A_3^*$ we
choose the points already so named on the circumference of the unit circle,
and for the points $A_1, A_2, A_3$ we choose the points $A_1^{(n)}, A_2^{(n)}, A_3^{(n)}$.

We now make use of the theorem‡ that if the functions x(u, v), y(u, v),
z(u, v), $u^2 + v^2 \le 1$, satisfy conditions (2.1), then the area of the surface S:
x = x(u, v), y = y(u, v), z = z(u, v), $u^2 + v^2 \le 1$, is equal to


† See, for instance, Carathéodory, Conformal Representation (No. 28 of the Cambridge Tracts in
Mathematics and Mathematical Physics), in particular chapter VII. While the theorem on the
conformal maps of polyhedra can be obtained as a special case of general facts, it should be observed
that this theorem was proved by H. A. Schwarz.

‡ E. J. McShane, loc. cit. in introduction. This theorem has also been established independently
and by different methods by C. B. Morrey, in a paper not as yet published.

This theorem is needed later, but for the present case it is stronger than necessary; equation
(4.8) can be established by simpler means. Cf. Radó, loc. cit., p. 774.




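$\iint_{u^2+v^2<1} (EG - F^2)^{1/2}\,du\,dv.$

Applying this to the polyhedra $\Pi_n$, we find

(4.8)  $L(\Pi_n) = \iint_{u^2+v^2<1} (E_n G_n - F_n^2)^{1/2}\,du\,dv;$

and by (4.6) this implies

(4.9)  $L(\Pi_n) = \iint_{u^2+v^2<1} \tfrac12 (E_n + G_n)\,du\,dv.$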


Since $\lim L(\Pi_n) = L(S)$, the right member of (4.9) is bounded. Observing that
the equation $\lim \Pi_n = S$ implies that the boundary curves $\Gamma_n$ of the polyhedra
$\Pi_n$ converge to $\Gamma$, we find that all of the hypotheses of Lemma 2 are satisfied.
We can therefore select a representation $x = x(\theta)$, $y = y(\theta)$, $z = z(\theta)$ of $\Gamma$ and
a subsequence of the $\{\Pi_n\}$ (for which we retain the same notation) such that

(4.10)  $\lim x_n(\theta) = x(\theta),\ \lim y_n(\theta) = y(\theta),\ \lim z_n(\theta) = z(\theta)$

uniformly in $\theta$.

This subsequence now satisfies the hypotheses of Lemma 1, and we can
therefore select from it a subsequence (which we continue to call $\{\Pi_n\}$) for
which the functions $x_n(u, v)$, $y_n(u, v)$, $z_n(u, v)$ converge uniformly on the whole
circle to limit functions x(u, v), y(u, v), z(u, v) which satisfy conditions (2.1).
Therefore the surface defined by the equations

(4.11)  $x = x(u, v),\ y = y(u, v),\ z = z(u, v),\ u^2 + v^2 \le 1,$

is a limit surface of the sequence $\{\Pi_n\}$. But the sequence has the unique limit
S; hence equations (4.11) form a new representation of the surface S.

Since the functions x(u, v), y(u, v), z(u, v) satisfy conditions (2.1), the
area is given by the classical integral; hence

(4.12)  $\iint_{u^2+v^2<1} \tfrac12 (E + G)\,du\,dv \ge \iint_{u^2+v^2<1} (EG)^{1/2}\,du\,dv \ge \iint_{u^2+v^2<1} (EG - F^2)^{1/2}\,du\,dv = L(S).$

On the other hand, we know by (4.9) and (2.6) that

(4.13)  $\lim L(\Pi_n) = \lim \iint_{u^2+v^2<1} \tfrac12 (E_n + G_n)\,du\,dv \ge \iint_{u^2+v^2<1} \tfrac12 (E + G)\,du\,dv;$

and since $\lim L(\Pi_n) = L(S)$, these inequalities imply




(4.14)  $L(S) = \iint_{u^2+v^2<1} \tfrac12 (E + G)\,du\,dv = \iint_{u^2+v^2<1} (EG - F^2)^{1/2}\,du\,dv.$

Now (E + G)/2 is never less than $(EG - F^2)^{1/2}$, and if E, F and G are finite the
two can be equal only if E = G and F = 0. Hence from (4.14) it follows that

(4.15)  E = G, F = 0 for almost all values of (u, v).
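
The pointwise inequality used in passing from (4.14) to (4.15) can be checked by a one-line identity. The display below is only a worked verification added for the reader, using the symbols E, F, G of the text and the standing facts that E and G are non-negative and that EG - F^2 is non-negative.

```latex
\[
\Bigl(\tfrac{E+G}{2}\Bigr)^{2} - \bigl(EG - F^{2}\bigr)
   = \Bigl(\tfrac{E-G}{2}\Bigr)^{2} + F^{2} \;\ge\; 0 ,
\qquad\text{so}\qquad
\tfrac12 (E+G) \;\ge\; \bigl(EG - F^{2}\bigr)^{1/2},
\]
with equality exactly when $E = G$ and $F = 0$.
```

Since the two integrals in (4.14) are equal while the difference of the integrands is non-negative, that difference must vanish almost everywhere, which is the content of (4.15).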


It remains only to show that these properties ((2.1) and (4.15)) of x(u, v),
y(u, v), z(u, v) remain invariant under conformal mappings

(4.16)  $u = u(\bar u, \bar v),\ v = v(\bar u, \bar v)$

of the unit circle on itself. We retain the sequence of polyhedra $\{\Pi_n\}$ obtained
above for which the representing functions $x_n(u, v)$, $y_n(u, v)$, $z_n(u, v)$ are
uniformly convergent. Apply to each of these the transformation (4.16); we
obtain a new representation

(4.17)  $x = \bar x_n(\bar u, \bar v) = x_n(u(\bar u, \bar v), v(\bar u, \bar v)),\ y = \bar y_n(\bar u, \bar v),\ z = \bar z_n(\bar u, \bar v).$

Since the mapping (4.16) is conformal, the functions $\bar x_n(\bar u, \bar v)$, etc., continue
to satisfy all the conditions (a), ..., (e) stated above. Also the sequences
$\bar x_n(\bar u, \bar v)$, etc., are uniformly convergent; let the limit function be $\bar x(\bar u, \bar v)$.
Then by the definition (4.17) of $\bar x_n(\bar u, \bar v)$, we have $\bar x(\bar u, \bar v) = x(u(\bar u, \bar v), v(\bar u, \bar v))$,
with like equations for y and z. All the arguments of the preceding paragraph
are applicable, and we thus find that $\bar x(\bar u, \bar v)$, etc., also satisfy conditions
(2.1) and (4.4). The theorem is thus established.

5. An area-reducing alteration. We shall in studying the problem of
Plateau have need of one further lemma.

LEMMA 3. Let the continuous surface

(5.1)  S: $x = x(u, v),\ y = y(u, v),\ z = z(u, v),\ u^2 + v^2 \le 1,$

have finite area. Then there exists a surface $\bar S$: $x = \bar x(u, v)$, $y = \bar y(u, v)$, $z = \bar z(u, v)$,
$u^2 + v^2 \le 1$, having the same boundary curve as S, and satisfying the conditions

(5.2)  $L(\bar S) \le L(S)$,
(5.3)  the functions $\bar x(u, v)$, $\bar y(u, v)$, $\bar z(u, v)$ are monotonic.

We need only a slight modification of a proof due to Lebesgue.† Let us
arrange the rational numbers in a sequence $r_1, r_2, \dots$. The point set at
which $x(u, v) > r_1$ is (if not null) an open‡ set, and consists of a finite or
denumerable set of maximal open connected sets. We disregard those sets


† Loc. cit., p. 382.
‡ Except that it may contain limit points on the circumference.




which have points in common with the circumference $u^2 + v^2 = 1$, and name
the remainder $R_1, R_2, \dots$. We treat similarly the point set for which
$x(u, v) < r_1$; the maximal connected open point sets, interior to the circle,
on which $x(u, v) < r_1$ we call $T_1, T_2, \dots$. We designate by $x^{(1)}(u, v)$ the
function equal to $r_1$ on $R_1 + T_1 + R_2 + T_2 + \cdots$, and equal to x(u, v) elsewhere;
this is a continuous function. Moreover, the area of the surface

(5.4)  $S^{(1)}$: $x = x^{(1)}(u, v),\ y = y(u, v),\ z = z(u, v)$

is at most equal to L(S). This is obvious if S is a polyhedron, for then the
images of $R_1$, $T_1$, etc., under (5.1) consist of a finite number of triangles, and
the images under (5.4) are the projections of those triangles on the plane
$x = r_1$. For the general case, we select (as is always possible) a sequence $\{\Pi_n\}$
of polyhedra represented in the form

(5.5)  $\Pi_n$: $x = x_n(u, v),\ y = y_n(u, v),\ z = z_n(u, v),\ u^2 + v^2 \le 1,$

such that $\lim L(\Pi_n) = L(S)$, and $\lim x_n(u, v) = x(u, v)$ uniformly on the
circle, with similar statements for y and z. We obtain $x_n^{(1)}(u, v)$ from $x_n(u, v)$
as above, and denote by $\Pi_n^{(1)}$ the polyhedron

$x = x_n^{(1)}(u, v),\ y = y_n(u, v),\ z = z_n(u, v),\ u^2 + v^2 \le 1.$

Then $\lim \Pi_n^{(1)} = S^{(1)}$, and $L(\Pi_n^{(1)}) \le L(\Pi_n)$; hence

$L(S^{(1)}) \le \liminf L(\Pi_n^{(1)}) \le \lim L(\Pi_n) = L(S).$

We now obtain $x^{(2)}(u, v)$ from $x^{(1)}(u, v)$ as we obtained $x^{(1)}(u, v)$ from
x(u, v), the number $r_2$ taking the place of $r_1$. For the corresponding surface
$S^{(2)}$ we have $L(S^{(2)}) \le L(S)$. Likewise we obtain $x^{(3)}$ from $x^{(2)}$, using $r_3$
instead of $r_2$; and so on. The sequence of functions $\{x^{(n)}(u, v)\}$ converges
uniformly to a limit function $\bar x(u, v)$, which is continuous and monotonic; the
proof of this is identical with that given by Lebesgue†, and we shall not
repeat it. We wish to emphasize two points; first, the functions $x^{(n)}(u, v)$ are
all equal to x(u, v) on the circumference $u^2 + v^2 = 1$, so that $\bar x(\theta) = x(\theta)$; and
second, the surface $S_1$ defined by the equations $x = \bar x(u, v)$, y = y(u, v),
z = z(u, v) has area at most equal to L(S), since

$L(S_1) \le \liminf L(S^{(n)}) \le L(S).$

The whole process being repeated for the function y(u, v), we obtain a
monotonic function $\bar y(u, v)$, equal to y(u, v) on the circumference, and such
that the surface defined by the equations $x = \bar x(u, v)$, $y = \bar y(u, v)$, z = z(u, v)
has area at most equal to L(S). Finally we repeat the whole process for


† Loc. cit.; beginning at the middle of p. 382.




z(u, v), and obtain a monotonic function $\bar z(u, v)$, equal to z(u, v) on the
circumference $u^2 + v^2 = 1$, and such that the surface $\bar S$ defined by the
equations

$x = \bar x(u, v),\ y = \bar y(u, v),\ z = \bar z(u, v),\ u^2 + v^2 \le 1,$

has area $L(\bar S) \le L(S)$. The lemma is thus established.
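
The single truncation step of the proof (replace x by the constant r on every interior component of {x > r} or {x < r}) can be imitated on a sampled function. The sketch below is only an illustrative discrete analogue, not the construction of the text: it assumes a rectangular grid whose outer rim stands in for the circumference, uses 4-neighbour connectivity, and the name `flatten_at_level` is hypothetical.

```python
from collections import deque

def flatten_at_level(grid, r):
    """Set to r every 4-connected component of {x > r} or {x < r}
    that does not touch the outer rim of the grid (discrete analogue
    of the sets R_i and T_i); all other cells are left unchanged."""
    h, w = len(grid), len(grid[0])
    out = [row[:] for row in grid]
    seen = [[False] * w for _ in range(h)]
    for i in range(h):
        for j in range(w):
            if seen[i][j] or grid[i][j] == r:
                continue
            above = grid[i][j] > r          # which side of the level we are on
            component, touches_rim = [], False
            queue = deque([(i, j)])
            seen[i][j] = True
            while queue:
                a, b = queue.popleft()
                component.append((a, b))
                if a in (0, h - 1) or b in (0, w - 1):
                    touches_rim = True
                for da, db in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                    na, nb = a + da, b + db
                    if (0 <= na < h and 0 <= nb < w and not seen[na][nb]
                            and grid[na][nb] != r
                            and (grid[na][nb] > r) == above):
                        seen[na][nb] = True
                        queue.append((na, nb))
            if not touches_rim:             # interior component: flatten it
                for a, b in component:
                    out[a][b] = r
    return out
```

Applying the function successively for a list of levels mimics the passage from x to x^(1), x^(2), and so on; values on the rim are never altered, which mirrors the remark that the boundary curve is unchanged.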

6. The problem of Plateau. Let $\Gamma$ be any Jordan curve in xyz-space; we
designate by $a(\Gamma)$ the greatest lower bound of the areas of all continuous
surfaces S bounded by $\Gamma$. On the other hand, let $\{\Pi_n\}$ be a sequence of
polyhedra whose boundary curves tend to $\Gamma$, and consider the quantity
$\liminf L(\Pi_n)$. The greatest lower bound of this quantity for all such sequences
$\{\Pi_n\}$ is called the minimum area of $\Gamma$; we denote it by $m(\Gamma)$. There is no
essential restriction in assuming that each polyhedron $\Pi_n$ of the sequence be
bounded by a Jordan polygon; for given $\Pi_n$, we can alter it so as to make the
boundary non-self-intersecting while changing the area and displacing the
boundary by arbitrarily small amounts.

It is easy to show that

(6.1)  $m(\Gamma) \le a(\Gamma).$

For let S be any continuous surface bounded by $\Gamma$, and let $\{\Pi_n\}$ be a sequence
of polyhedra tending to S and such that $L(\Pi_n)$ tends to L(S). The boundaries
$\Gamma_n$ of the polyhedra $\Pi_n$ tend to $\Gamma$, and therefore

$m(\Gamma) \le \liminf L(\Pi_n) = L(S);$

this being true for every surface S bounded by $\Gamma$, inequality (6.1) follows
immediately. We shall later prove (as is already known†) that $a(\Gamma) = m(\Gamma)$
for every Jordan curve $\Gamma$.
We now proceed to prove
THEOREM‡ II. For every Jordan curve $\Gamma$ whose minimum area $m(\Gamma)$ is
finite, there exists a continuous surface

(6.2)  S: $x = x(u, v),\ y = y(u, v),\ z = z(u, v),\ u^2 + v^2 \le 1,$

bounded by $\Gamma$ and satisfying the following conditions:

(a) the area of S is the least possible among all surfaces bounded by $\Gamma$, i.e.,
(6.3)  $L(S) = a(\Gamma) = m(\Gamma);$

(b) the functions x(u, v), y(u, v), and z(u, v) are analytic, and in fact harmonic,
for $u^2 + v^2 < 1$;

(c) the surface S is a minimal surface.


† Radó, loc. cit., p. 776.
‡ Douglas, loc. cit.; Radó, loc. cit., p. 791.




Let $\{\Pi_n\}$ be a sequence of polyhedra having areas $L(\Pi_n)$ tending to
$m(\Gamma)$, and bounded by Jordan polygons $\Gamma_n$ tending to $\Gamma$. By Lemma 3, for
each n we can find a surface $S_n$ bounded by $\Gamma_n$ and of area $L(S_n) \le L(\Pi_n)$
for which the representing functions $\bar x_n(u, v)$, $\bar y_n(u, v)$, $\bar z_n(u, v)$ are
monotonic. Now let $A_1^*, A_2^*, A_3^*$ be three distinct points on the unit
circumference $u^2 + v^2 = 1$, and let $A_1, A_2, A_3$ be three distinct points on $\Gamma$. Since
$\Gamma_n$ tends to $\Gamma$, we can on each $\Gamma_n$ select three points $A_1^{(n)}, A_2^{(n)}, A_3^{(n)}$ such that
$A_i^{(n)}$ tends to $A_i$ (i = 1, 2, 3). By Theorem I there exists a representation

(6.4)  $S_n$: $x = x_n(u, v),\ y = y_n(u, v),\ z = z_n(u, v),\ u^2 + v^2 \le 1,$

of $S_n$ such that the functions $x_n(u, v)$, etc., satisfy conditions (2.1), and

(6.5)  $E_n = G_n$ and $F_n = 0$ almost everywhere.

Moreover, we may assume that equations (6.4) carry the points $A_1^*, A_2^*, A_3^*$
into $A_1^{(n)}, A_2^{(n)}, A_3^{(n)}$ respectively; for if the $A_i^{(n)}$ correspond under (6.4) to $B_i^*$
(i = 1, 2, 3) on the unit circle, then by a conformal mapping $u = u(\bar u, \bar v)$,
$v = v(\bar u, \bar v)$, we can map the $B_i^*$ on the $A_i^*$, and by Theorem I the new
functions $x_n(u(\bar u, \bar v), v(\bar u, \bar v))$, etc., continue to satisfy conditions (2.1) and
(6.5). By (6.5), we have

(6.6)  $\iint_{u^2+v^2<1} \tfrac12 (E_n + G_n)\,du\,dv = \iint_{u^2+v^2<1} (E_n G_n - F_n^2)^{1/2}\,du\,dv = L(S_n);$

hence the first expression in (6.6) is bounded. The sequence $\{S_n\}$ therefore
satisfies the hypotheses of Lemma 2, and so there exists a representation
$x = x(\theta)$, $y = y(\theta)$, $z = z(\theta)$ of the curve $\Gamma$ and a subsequence of $\{S_n\}$ (for which
we retain the same notation) such that

$\lim x_n(\theta) = x(\theta),\ \lim y_n(\theta) = y(\theta),\ \lim z_n(\theta) = z(\theta)$

uniformly in $\theta$.

The surface $S_n$ has one representation $x = \bar x_n(u, v)$, etc., in which the
functions $\bar x_n(u, v)$, etc., are monotonic; hence by §1 we know that in the
representation (6.4) of $S_n$ the functions $x_n(u, v)$, etc., are monotonic. The
sequences $\{x_n(u, v)\}$, $\{y_n(u, v)\}$, $\{z_n(u, v)\}$ are therefore seen to satisfy all the
hypotheses of Lemma 1; hence there exist monotonic functions $\bar x(u, v)$,
$\bar y(u, v)$, $\bar z(u, v)$ defined on the unit circle such that

(6.7)  $\lim x_n(u, v) = \bar x(u, v),\ \lim y_n(u, v) = \bar y(u, v),\ \lim z_n(u, v) = \bar z(u, v)$




uniformly on the whole circle. Consider now the surface

S: $x = \bar x(u, v),\ y = \bar y(u, v),\ z = \bar z(u, v),\ u^2 + v^2 \le 1;$

its boundary curve is a limit curve of the $\Gamma_n$, since the convergence in (6.7)
is uniform on the circumference $u^2 + v^2 = 1$, and since the $\Gamma_n$ have the unique
limit $\Gamma$, the boundary curve of S is $\Gamma$ itself. Moreover, by the semi-continuity
of the Lebesgue area we have

(6.8)  $L(S) \le \liminf L(S_n) \le \lim L(\Pi_n) = m(\Gamma).$

But since S is bounded by $\Gamma$, we have

(6.9)  $L(S) \ge a(\Gamma);$

comparing inequalities (6.8), (6.9) and (6.1), we have $L(S) = a(\Gamma) = m(\Gamma)$,
which establishes equation (6.3).
By Theorem I, there exists a representation

(6.10)  $x = x(u, v),\ y = y(u, v),\ z = z(u, v),\ u^2 + v^2 \le 1,$

of S for which conditions (2.1) are satisfied, and further

(6.11)  E = G and F = 0 almost everywhere.

Then†

(6.12)  $L(S) = \iint_{u^2+v^2<1} (EG - F^2)^{1/2}\,du\,dv = \iint_{u^2+v^2<1} \tfrac12 (E + G)\,du\,dv.$

From this it follows that x(u, v), y(u, v), z(u, v) are harmonic. For suppose,
e.g., that x(u, v) is not harmonic, and let $\tilde x(u, v)$ be the harmonic function
having the same boundary values as x(u, v). The function $\tilde x$ minimizes‡ the
Dirichlet integral for the given boundary values, and is the unique minimizing
function; hence for the surface

$\tilde S$: $x = \tilde x(u, v),\ y = y(u, v),\ z = z(u, v),\ u^2 + v^2 \le 1,$

we have $\iint (\tilde E + \tilde G)\,du\,dv < \iint (E + G)\,du\,dv$. Therefore

(6.13)  $L(\tilde S) = \iint_{u^2+v^2<1} (\tilde E \tilde G - \tilde F^2)^{1/2}\,du\,dv \le \iint_{u^2+v^2<1} \tfrac12 (\tilde E + \tilde G)\,du\,dv < \iint_{u^2+v^2<1} \tfrac12 (E + G)\,du\,dv = L(S) = a(\Gamma).$


† McShane or Morrey, loc. cit. in §4.

‡ Lebesgue's proof of the minimizing property (Bulletin de la Société Mathématique de France,
vol. 41 (1913); p. 48 of the Comptes Rendus) can easily be modified to show that $\tilde x(u, v)$ minimizes
the Dirichlet integral in the class of all functions having the given boundary values and satisfying
conditions (2.1).




But $\tilde S$ is bounded by $\Gamma$, hence $L(\tilde S) \ge a(\Gamma)$, contradicting (6.13). Hence
x(u, v), y(u, v), z(u, v) are harmonic.

Since E, F, G are now seen to be continuous, equations (6.11) imply E = G,
F = 0 everywhere in the unit circle. By a theorem of Weierstrass we know that
if a surface S is so represented that E = G, F = 0, the surface S is minimal if
and only if the functions x, y, z are harmonic; these conditions being here
satisfied, our surface S is a minimal surface, and the theorem is proved.

Between the present solution of the problem of Plateau and that given by
Radó there remains one point of difference. We have not shown that our
equations (6.2) carry the circumference of the unit circle in a one-to-one
way into the curve $\Gamma$. We can however very easily establish this by the
same device as was used by Radó,† to whose work we refer the reader.


† Radó, loc. cit.; in particular, chapter 2, §3, No. 9.


UNIVERSITY OF CHICAGO,
CHICAGO, ILL.


THE LATIN SQUARE, OR CYCLIC, FUNCTIONS* 


BY 
E. T. BELL 


1. Introduction. Special cases of the Latin square functions defined in
this paper have recently come into some prominence in connection with
generalizations by Humbert and others (references in §5) of the partial
differential equations of mathematical physics. In solving the equations, the
functions of r - 1 independent variables defined by Appell*} in 1877 appear,
and these in turn are intimately connected with Olivier's¹ functions
$f_0(x), \dots, f_{r-1}(x)$ whose generating identity is

(1.1)  $\exp \alpha x = f_0(x) + \alpha f_1(x) + \cdots + \alpha^{r-1} f_{r-1}(x),$

where $\alpha$ is an imaginary rth root of unity,

(1.2)  $f_j(x) = \sum x^{n_j} / n_j!,$

the summation referring to all integers $n_j \ge 0$ such that $n_j \equiv j \bmod r$. We shall
call r the base of $f_j(x)$. Appell's functions $A_t$ can be defined by expanding the
left member of the following identity as a power series in $\alpha$, and reducing the
result modulo $\alpha^r - 1$,

(1.3)  $\exp \Bigl( \sum_{s=1}^{r-1} \alpha^s x_s \Bigr) = \sum_{t=0}^{r-1} \alpha^t A_t(x_1, \dots, x_{r-1}).$

The r functions $A_t(x_1, \dots, x_{r-1}) \equiv A_t$ are connected by the identical
algebraic relation

(1.4)  $N(A_0, \dots, A_{r-1}) = 1,$

where $N(y_0, \dots, y_{r-1})$ is the norm of the algebraic number
$y_0 + \alpha y_1 + \cdots + \alpha^{r-1} y_{r-1}$.
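
For the one-variable case the generating identity and the norm relation are easy to check numerically. The following is a minimal sketch, assuming base r = 3 and a truncated power series; the function name `olivier` is hypothetical, and the "norm" is taken here as the product over all r-th roots of unity (the circulant determinant), which is an assumption about the intended meaning.

```python
import cmath
from math import factorial

def olivier(j, r, x, terms=60):
    """Truncated series for f_j(x): sum of x**n / n! over n >= 0, n = j (mod r)."""
    return sum(x**n / factorial(n) for n in range(j, terms, r))

r, x = 3, 0.7
roots = [cmath.exp(2j * cmath.pi * k / r) for k in range(r)]  # all r-th roots of unity

# Generating identity: exp(alpha*x) = sum_j alpha**j * f_j(x) for each root alpha.
identity_errors = [
    abs(cmath.exp(alpha * x) - sum(alpha**j * olivier(j, r, x) for j in range(r)))
    for alpha in roots
]

# Product over all roots: equals exp(x * (sum of the roots)) = exp(0) = 1.
norm = 1
for alpha in roots:
    norm *= sum(alpha**j * olivier(j, r, x) for j in range(r))
```

For r = 3 the product expands to f0^3 + f1^3 + f2^3 - 3 f0 f1 f2, so the relation states that this cubic form in the three Olivier functions is identically 1.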


As the partial differential equations mentioned have no immediate physical
significance, there is no apparent reason for stopping short of the general
case. In a previous paper’ the functions defined by reducing the left of (1.3)
modulo $P(\alpha)$, where $P(\alpha)$ is any polynomial in $\alpha$, were introduced and some
of their properties discussed. The norm property (1.4) does not hold for these
functions, except in the very degenerate case when they become Appell's.


* Presented to the Society, March 18, 1933; received by the editors December 27, 1932. 
† Numbers refer to bibliography in §5.




It will be interesting to see what replaces the norm property, and how it
degenerates in the special case.

We shall see that the generalized norm property is intimately connected
with Latin squares. A Latin square of degree m is a square array of m distinct
elements such that no element occurs twice in the same row or in the same
column. The number of Latin squares of degree m, no two of which can be derived
from one another by a permutation of rows or of columns, will be denoted by
$\lambda(m)$. This number
has not been determined for general m, and even for small m the labor of a 
direct determination is prohibitive (see MacMahon‘). As observed by Cay- 
ley,? not every Latin square of given degree can be generated by a group of 
substitutions on the elements of a given row. Thus there exist (even for m 
small) Latin squares with which no group is associated. 
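
As a concrete illustration of these notions, the sketch below tests the Latin property (each symbol exactly once in every row and every column) and builds the square generated from its first row by the cyclic group. It is an illustrative sketch only; the helper names are hypothetical.

```python
def is_latin_square(rows):
    """True if rows is an n-by-n array on n distinct symbols in which
    every symbol occurs exactly once in each row and each column."""
    n = len(rows)
    if n == 0:
        return False
    symbols = set(rows[0])
    if len(symbols) != n:
        return False
    rows_ok = all(len(row) == n and set(row) == symbols for row in rows)
    cols_ok = rows_ok and all(set(col) == symbols for col in zip(*rows))
    return rows_ok and cols_ok

def cyclic_latin_square(first_row):
    """Square whose i-th row is the first row shifted cyclically by i places."""
    n = len(first_row)
    return [[first_row[(i + j) % n] for j in range(n)] for i in range(n)]

square = cyclic_latin_square(["x1", "x2", "x3"])
```

The cyclic square is the one that the text associates with the degenerate (Appell) case; squares not arising from any group first occur at larger degrees and are not constructed here.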

The norm relation is replaced for the generalized functions of r independent
variables by $\lambda(r)$ algebraic relations, each of which is derived from a Latin
square of degree r. When the functions degenerate to Appell's (based on rth
roots of unity), the $\lambda(r)$ relations coalesce in the norm relation, and the single
Latin square corresponding to this relation is generated from its first row by
the cyclic group of degree r.

Appell's functions are a simple generalization to functions of r independent
variables of the circular and hyperbolic functions. The Latin square
functions pass at once to the most general situation possible of this kind,
namely to the functions of r independent variables constructed from
polynomials in the members of sets of r linearly independent solutions of
equations of the type

$\dfrac{d^r y}{dx^r} + c_1 \dfrac{d^{r-1} y}{dx^{r-1}} + \cdots + c_r y = 0,$

where $c_1, \dots, c_r$ are arbitrary constants, instead of from the degenerate case
$c_r = -1$, $c_j = 0$, $j < r$. The coefficients in the power series for Olivier's functions,
on which Appell's are based, are periodic. In the generalized functions
periodicity, $\varphi(n + r) = \varphi(n)$ for all integers n, is replaced by

$\varphi(n + r) + c_1 \varphi(n + r - 1) + \cdots + c_r \varphi(n) = 0,$

which becomes periodicity in the degenerate case.

All the functions defined are obviously continuous and convergent
absolutely for all finite values of the variables.

2. Generalized Olivier functions. Consider first the generalization of
Olivier's functions. Let

(2.1)  $P(\alpha) \equiv \alpha^r + c_1 \alpha^{r-1} + \cdots + c_r$

be irreducible in the rational domain. Reduction modulo $P(\alpha)$ of the
expansion of $\exp \alpha^s x$, where s is an integer, defines the functions $f_{js}(x)$ uniquely,

(2.2)  $\exp \alpha^s x = \sum_{j=0}^{r-1} \alpha^j f_{js}(x),$

since $P(\alpha)$ is irreducible. We write

(2.3)  $f_{j1}(x) \equiv f_j(x) \quad (j = 0, \dots, r - 1).$

The notation in (2.1) is fixed throughout the paper.
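The jth fundamental sequence $\varphi_j(n)$, $n = 0, \pm 1, \pm 2, \dots$, defined by the
difference equation

(2.4)  $\varphi(n + r) + c_1 \varphi(n + r - 1) + \cdots + c_r \varphi(n) = 0,$

whose characteristic equation is $P(\alpha) = 0$, is determined by

(2.5)  $\varphi_j(k) = \delta_{jk}$ (Kronecker delta), $j, k = 0, \dots, r - 1.$

The $\varphi_j(n)$ are a set of r linearly independent solutions of (2.4), and the general
solution is

(2.6)  $\varphi(n) = \sum_{j=0}^{r-1} \varphi(j)\, \varphi_j(n).$

The notations in (2.4), (2.5) are fixed henceforth.
For all integers n we have

(2.7)  $\alpha^n = \sum_{j=0}^{r-1} \alpha^j \varphi_j(n).$

Hence, by (2.2),

(2.8)  $f_{js}(x) = \sum_{n=0}^{\infty} \varphi_j(sn)\, \dfrac{x^n}{n!} \quad (j = 0, \dots, r - 1).$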
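
The fundamental sequences and the series (2.8) can be computed directly from the recurrence. The sketch below is an illustration under stated assumptions: it takes the sample polynomial a^2 - a - 1 (so c_1 = c_2 = -1, irreducible over the rationals), handles only n >= 0 and s >= 0, truncates the series, and uses hypothetical function names.

```python
from math import exp, factorial, sqrt

def fundamental_sequences(c, n_max):
    """phi[j][n] for 0 <= n <= max(n_max, r-1), where c = [c_1, ..., c_r],
    phi(n+r) + c_1*phi(n+r-1) + ... + c_r*phi(n) = 0 and phi_j(k) = delta_jk
    for 0 <= k < r."""
    r = len(c)
    phi = []
    for j in range(r):
        seq = [1 if k == j else 0 for k in range(r)]
        for n in range(r, n_max + 1):
            # phi(n) = -(c_1*phi(n-1) + ... + c_r*phi(n-r))
            seq.append(-sum(c[i] * seq[n - 1 - i] for i in range(r)))
        phi.append(seq)
    return phi

def generalized_olivier(j, s, x, c, terms=40):
    """Truncated series sum_n phi_j(s*n) * x**n / n!  (the form of (2.8))."""
    phi = fundamental_sequences(c, s * terms)
    return sum(phi[j][s * n] * x**n / factorial(n) for n in range(terms))

c = [-1, -1]                  # P(a) = a**2 - a - 1 (sample choice)
alpha = (1 + sqrt(5)) / 2     # one root of P
s, x = 2, 0.5
lhs = exp(alpha**s * x)
rhs = sum(alpha**j * generalized_olivier(j, s, x, c) for j in range(len(c)))
```

With this choice the two fundamental sequences are shifted Fibonacci sequences, and `lhs` and `rhs` agree to rounding error, which is the identity (2.2) evaluated at a real root.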


To find the differential equation satisfied by the functions (2.8), let $P_s(\alpha)$
be the polynomial with leading coefficient unity whose roots are the sth
powers of the roots of $P_1(\alpha) \equiv P(\alpha)$,

(2.9)  $P_s(\beta) \equiv \beta^r + c_1(s) \beta^{r-1} + \cdots + c_r(s) \quad (\beta = \alpha^s).$

Then, by (2.2), (2.9), r linearly independent solutions of

(2.10)  $\dfrac{d^r y}{dx^r} + c_1(s) \dfrac{d^{r-1} y}{dx^{r-1}} + \cdots + c_r(s)\, y = 0$

are the functions (2.8), and the general solution of (2.10) is

(2.11)  $y = \sum_{j=0}^{r-1} k_j f_{js}(x),$

where the $k_j$ are arbitrary constants.
The exponential forms of the functions (2.8), corresponding to those of the 
circular, hyperbolic, and Olivier functions, are obtained at once from (2. . 
If ao, - - - , @ are the roots of P(a) =0, and a‘/ denotes the cofactor of ak 


in the determinant 


we have 


t=0 


since D(a) #0, P(a) being irreducible. 
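Corresponding to the periodic recurrence of the derivatives of the circular,
hyperbolic, and Olivier functions, we have here

(2.13)  $\dfrac{d^t}{dx^t} f_{is}(x) = \sum_{j=0}^{r-1} \varphi_i(st + j)\, f_{js}(x) \quad (i = 0, \dots, r - 1),$

on differentiating (2.2) t times and applying (2.7).
Applying (2.7) to the product of $\exp \alpha^s x$ and $\exp \alpha^s y$, we get the addition
theorems

(2.14)  $f_{js}(x + y) = \sum_{p=0}^{r-1} \sum_{k=0}^{r-1} \varphi_j(p + k)\, f_{ps}(x)\, f_{ks}(y).$

There is no algebraic addition theorem with respect to s.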
Let a be any root of P(a) =0. Then 


exp [xa*P(a)] = 1, 


and hence, by (22), the identical algebraic relations between the functions are 
obtained by reducing the expression on the left of the following, modulo P(a), 
to that on the right (co=1), 


the relations are 


r—1 
1 ao +++ Qo 
r—1 
1 a a 
D(a) = | 
r—1 
1 * Op} 




AY 


5 


For Olivier’s functions it is easily seen that (2.16) are equivalent to the single 
norm relation (the last r—1 relations are absent). 

If P(a) is such that, for some integer s >0, ci(s) =0 in (2.9), the functions 
Ii), 7=0, - - -,r—1, are more simply connected. Let the roots of P,(a) =0 
be Bo, - - - , 8-1. If B is any one of the roots, we may define functions g;,(/) 
by the process for (2.2) with P(a) replaced by the right of (2.9), 


(2.17) exp Bix = 


j=0 


Apply (2.7) to B’=a’*. Then 


(2.18) n(*) = *) 


St j=0 
= g,(x). 


If now c:(s) =0, Bo+ - - - +8,1=0, and exp - - - Hence 
(2.19) M(g0(x), &r—1(X)) i, 


where Mo(yo, - - , yr-1) is the norm of yo+yi8+ - - - If further 
s =1, (2.16) hence becomes 


(2.20) No(fo(x),---, = 1, 


where No(yo, - - - , is the norm of yotyiat+ A linear 
transformation on the f;(x) will always reduce (2.16) when s=1 to (2.20). 

Since there are precisely r functions f;(*) of the single variable x, they 
must be connected by r—1 relations. These are contained in the 7 relations 
(2.12), which are not independent, or in the equivalent dependent set ob- 
tained from (2.2) by putting a, - - - ,a@,-1 successively for a. The dependence 
for the last set of r is evident from —c,(s)=ao'+ - +a;-1; is a ra- 
tional function of ¢:, - - - , ¢--1. The r—1 independent relations are transcen- 
dental. 

3. Functions with periodic coefficients. A special case of the functions (2.3)
is of particular interest as it can be completely specified with remarkable
simplicity. In a previous note!’ it was shown that the only difference equations
(2.4) whose solutions have the proper additive period m (integer > 0) are
those in which $r = \tau(m)$, the totient (Euler's function) of m, and $P(\alpha) = 0$ is
the equation whose roots are the $\tau(m)$ primitive mth roots of unity. In this
section m is a constant integer > 0, $r = \tau(m)$, and

(3.1)  $\alpha^r + c_1 \alpha^{r-1} + \cdots + c_r = 0$

is the equation for the primitive mth roots of unity. All of §2 necessarily
holds in this case, with special features not valid in §2. The notation is as
before; in particular the general solution of

(3.2)  $\varphi(n + r) + c_1 \varphi(n + r - 1) + \cdots + c_r \varphi(n) = 0$

is $\varphi(n)$. The sequence $\varphi(n)$ $(n = 0, \pm 1, \dots)$ is determined by (3.2) when
$\varphi(0), \dots, \varphi(r - 1)$ are given constants.
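
The periodicity claim can be observed on small cases. The sketch below is illustrative only: the cyclotomic coefficients for m = 5 and m = 6 are written in by hand (an assumption of the example, not computed), the initial values are those of the fundamental sequence with j = 0, and all names are hypothetical.

```python
from math import gcd

# Coefficients [c_1, ..., c_r] of the polynomial whose roots are the
# primitive m-th roots of unity, entered by hand for two sample values of m:
#   m = 5: a**4 + a**3 + a**2 + a + 1,   m = 6: a**2 - a + 1.
CYCLOTOMIC = {5: [1, 1, 1, 1], 6: [-1, 1]}

def totient(m):
    """Euler's function: number of 1 <= k <= m prime to m."""
    return sum(1 for k in range(1, m + 1) if gcd(k, m) == 1)

def solve_recurrence(c, initial, length):
    """Extend initial values phi(0..r-1) by
    phi(n) = -(c_1*phi(n-1) + ... + c_r*phi(n-r))."""
    r = len(c)
    seq = list(initial)
    while len(seq) < length:
        n = len(seq)
        seq.append(-sum(c[i] * seq[n - 1 - i] for i in range(r)))
    return seq

def proper_period(seq, max_period):
    """Smallest p with seq[n+p] == seq[n] on the sampled range, else None."""
    for p in range(1, max_period + 1):
        if all(seq[n + p] == seq[n] for n in range(len(seq) - p)):
            return p
    return None

periods = {}
for m, c in CYCLOTOMIC.items():
    seq = solve_recurrence(c, [1] + [0] * (len(c) - 1), 40)
    periods[m] = proper_period(seq, 20)
```

In both sample cases the order r of the recurrence equals the totient of m and the computed proper period is m itself.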
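From what has just been recalled it follows that the only functions

(3.3)  $f(x) = \sum_{n=0}^{\infty} \psi(n)\, x^n / n!,$

in which $\psi(n)$ has the proper additive period m and is determined by a linear
difference equation with constant coefficients, are those in which $\psi = \varphi$,

$f(x) = \sum_{n=0}^{\infty} \varphi(n)\, x^n / n!,$

and hence

(3.4)  $f(x) = \sum_{t=0}^{m-1} \varphi(t) \Bigl[ \sum_{n=0}^{\infty} \dfrac{x^{nm+t}}{(nm + t)!} \Bigr].$

The functions in square brackets, say

(3.5)  $h_t(x) = \sum_{n=0}^{\infty} \dfrac{x^{nm+t}}{(nm + t)!},$

are the m Olivier functions to the base m; see (1.2). Hence, by (2.6), the
general function (3.3) with periodic coefficients of the kind described is

(3.6)  $f(x) = \sum_{j=0}^{r-1} \varphi(j) \Bigl[ \sum_{t=0}^{m-1} \varphi_j(t)\, h_t(x) \Bigr].$

Consider the functions in the square brackets in (3.6),

(3.7)  $H_j(x) = \sum_{t=0}^{m-1} \varphi_j(t)\, h_t(x) \quad (j = 0, \dots, r - 1).$

The generating identity is

$\exp \alpha x = \sum_{j=0}^{r-1} \alpha^j H_j(x),$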


where $\alpha$ is any root of (3.1). Hence, taking the nth derivative, we get

(3.8)  $\dfrac{d^n}{dx^n} H_j(x) = \sum_{i=0}^{r-1} \varphi_j(n + i)\, H_i(x),$

and therefore, by the periodicity of $\varphi_j$,

(3.9)  $\dfrac{d^{km+n}}{dx^{km+n}} H_j(x) = \dfrac{d^n}{dx^n} H_j(x) \quad (n = 0, \dots, m - 1;\ j = 0, \dots, r - 1)$

for all integers k > 0. Thus the derivatives of the $H_j(x)$ recur with the period
m. Since $\varphi(0), \dots, \varphi(r - 1)$ in (3.6) are arbitrary constants, (3.9) implies

(3.10)  $\dfrac{d^{km+n}}{dx^{km+n}} f(x) = \dfrac{d^n}{dx^n} f(x),$

and f(x) is the most general function with recurring derivatives of period m.

Consider next the functions (3.3) in which $\psi(n)$ has the proper multiplicative
period m + 1 (m integer > 0), and in which $\psi(n)$ is determined by a linear
difference equation with constant coefficients. It follows from the theorem
recalled for additive periodicity that the only such $\psi(n)$ with multiplicative
period m + 1 > 1 are the $\varphi(n)$ defined by (3.1) as before. Hence the properties
of these functions follow from those just discussed.

4. Latin square functions. In this section the notation is as in §2. We shall
need particularly (2.1)–(2.4). As a basis for the numbers of the field $K(\alpha)$ we
shall take $1, \alpha, \dots, \alpha^{r-1}$, and we shall denote the element of $K(\alpha)$ whose
coordinates are $x_0, \dots, x_{r-1}$ by (x),

(4.1)  $(x) \equiv (x_0, \dots, x_{r-1}) \equiv x_0 + \alpha x_1 + \cdots + \alpha^{r-1} x_{r-1}.$

The sum of (x), (y) may be written either as (x) + (y) or (x + y),

(4.2)  $(x + y) \equiv (x_0 + y_0, \dots, x_{r-1} + y_{r-1});$

their product, (x)(y) or (xy), is

(4.3)  $(xy) \equiv ((xy)_0, \dots, (xy)_{r-1}), \qquad (xy)_j = \sum_{i=0}^{r-1} \sum_{p=0}^{r-1} \varphi_j(i + p)\, x_i y_p.$
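
The product rule can be cross-checked against ordinary polynomial multiplication followed by reduction modulo P. The sketch below is an illustration under stated assumptions: it uses the sample polynomial a^3 - 2 (so c = [0, 0, -2]) with integer coordinates, and the helper names are hypothetical.

```python
def fundamental_sequences(c, n_max):
    """phi[j][n], 0 <= n <= max(n_max, r-1), with phi_j(k) = delta_jk for k < r
    and phi(n) = -(c_1*phi(n-1) + ... + c_r*phi(n-r)) for n >= r."""
    r = len(c)
    phi = []
    for j in range(r):
        seq = [1 if k == j else 0 for k in range(r)]
        for n in range(r, n_max + 1):
            seq.append(-sum(c[i] * seq[n - 1 - i] for i in range(r)))
        phi.append(seq)
    return phi

def product_by_formula(x, y, c):
    """(xy)_j = sum over i, p of phi_j(i+p) * x_i * y_p."""
    r = len(c)
    phi = fundamental_sequences(c, 2 * r - 2)
    return [sum(phi[j][i + p] * x[i] * y[p] for i in range(r) for p in range(r))
            for j in range(r)]

def product_by_reduction(x, y, c):
    """Multiply as polynomials in a, then reduce modulo
    P(a) = a**r + c_1*a**(r-1) + ... + c_r."""
    r = len(c)
    coeffs = [0] * (2 * r - 1)
    for i in range(r):
        for p in range(r):
            coeffs[i + p] += x[i] * y[p]
    for k in range(2 * r - 2, r - 1, -1):
        top, coeffs[k] = coeffs[k], 0
        # a**k = -(c_1*a**(k-1) + ... + c_r*a**(k-r))
        for i in range(r):
            coeffs[k - 1 - i] -= top * c[i]
    return coeffs[:r]

c = [0, 0, -2]                       # P(a) = a**3 - 2 (sample choice)
x, y = [1, 2, 3], [4, 5, 6]
by_formula = product_by_formula(x, y, c)
by_reduction = product_by_reduction(x, y, c)
```

Both routes give the coordinates of the same field element; for the sample data this is 58 + 49a + 28a^2.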


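More generally, the product of any finite number of elements (x), (y), ...,
(z) of $K(\alpha)$, in a similar notation, is given by

(4.4)  $(xy \cdots z)_j = \sum \varphi_j(i + p + \cdots + q)\, x_i y_p \cdots z_q,$

the sum extending to all $0 \le i, p, \dots, q \le r - 1$.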




The element (x)’ of K(a) defined by 
(x)! = (x) — xo + x0" = 
(4.5) = — KE = Le — -++,r—1), 
(x)! = ax, + +--+ + 


will be called the curtate of (x). Accents as in (x)’, (y)’, - - - shall denote the 
curtates of the corresponding (x), (y),---. 
The Latin square functions of degree r in the independent variables
$x_1, \dots, x_r$ are denoted by $L_j(x_1, \dots, x_r)$, and are defined by the identity (4.6),
in which the right is the reduction modulo $P(\alpha)$ of the expansion of the left as
a power series in $\alpha$,

(4.6)  $\exp (x)' = \sum_{j=0}^{r-1} \alpha^j L_j(x_1, \dots, x_r).$
To find the algebraic relations between the L; mentioned in §1, we pro- 
ceed as described presently from the Latin square (4.7) to its “bordered 
mate” (4.8). We assume r2=1. Let x, - - - , x be the ith row of the Latin 
square (4.7) of degree r constructed from - - - , x,, so that x, ---, 
is some permutation of x, - - , 


(4.7) 


Write —s=x,:+a.+ ---+2,. Multiply the elements in the jth column, 
j>1, of (4.7) by c;1. Apply a’, a—', - - - , a as top border to the result, and 
S, , as a bottom border: 


r—1 
a ’ 


Consider the rows of (4.8) as vectors and take the inner product of the vector 
whose coordinates are the top border by each of the remaining r+1. The sum 
of these r+1 inner products vanishes, as it is (vi+ - - - +2,) P(a), from the 


741 
|_| 
(2) (2) (2) 
(r) (r) (r) 
» 5° ** 
a’, | 
(1) (1) ql) 
142 5° °° 5 Gr—14r 
(4.8) 
(r) (r) (r) 
5, CS, Cras 




construction of (4.8). These products are the curtates 


(i) (i) @,, 
(4 9) » Cr—aX%p-1, * , ) 


To simplify the writing, let the r+1 curtates in (4.9) be equal respectively to 


(i) (i) (i) 
(4 10) (v1 y’ 
(r+1) (r+1) (r+1), 
en 


r+1 
p=1 


and hence, by (4.6), 


r+1 . 
(4.11) II | Li =1. 
p=1 ip=0 

When distributed and reduced modulo P(a), the left of (4.11) is of the 
form No+aNi+ - - - +a” where N; is a homogeneous polynomial of 
degree r+1 in functions Lo, Zi, - - - , L,-1 whose variables are given in (4.10). 
For the moment the structure of the NV; need not be considered. Starting then 
with the particular Latin square (4.7), we reach the identical relations 


(4.12) No =1,N; =0 


We indicate the structure of the N’s presently. 

From (4.5), (4.6) we find explicit forms for the Z;. The expression for the 
L; corresponding to (2.12) is obvious and can be omitted. Let 00, - - - , 0,1 
be the r conjugates of (x)’, including (x)’. Form the equation 


(4.121) r+ +---+5,=0 (0 = , O-1) 


whose roots are these conjugates. Then b;=b;(x, - - - ,.x,) is a homogeneous 
polynomial of degree j in x, - - - , x,, whose coefficients are polynomials in 
(i, °°, ¢ with rational integer coefficients. Similarly to the discussion for 
(2.3)—(2.5) we consider the difference equation 


(4.13) + = 0, 


whose characteristic equation is (4.121). The r fundamental sequences £;(m) 
for (4.13) are determined by 


(4.14) E(k) = (j,k =0,---,r—1); 


the general solution £(m) is 


(4.15) = 

and we have 

(4.16) 9" = 

The powers of $\theta$ on the right of (4.16) must be reduced modulo P(a) independently.
Let


Then $p_{ji} = p_{ji}(x_1, \ldots, x_r)$ is a polynomial in $x_1, \ldots, x_r$ whose coefficients
are polynomials in $c_1, \ldots, c_r$ with rational integer coefficients. From (4.16)
we now have


(4.18) = | 
i=0 


j=0 
hence, by (4.6), 
(4.19) 


The $p_{ji}$ are defined by (4.17), and the $\xi_j(m)$ are the fundamental solutions
of (4.13).
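
As an illustration of the fundamental sequences of (4.13), the following Python sketch builds them for a given monic characteristic equation and checks that each power of a root reduces to a combination of the lower powers with those sequences as coefficients. The index convention $\xi(m+r) + b_1\xi(m+r-1) + \cdots + b_r\xi(m) = 0$, the normalisation $\xi_j(k) = \delta_{jk}$, the sample equation, and the function names are assumptions made for this sketch and are not taken from the paper.

```python
def fundamental_sequences(b, length):
    """Fundamental sequences of xi(m+r) + b[0]*xi(m+r-1) + ... + b[r-1]*xi(m) = 0.

    Assumed normalisation: xi_j(k) = 1 if j == k else 0 for 0 <= j, k < r.
    Returns r lists, the j-th holding xi_j(0), ..., xi_j(length-1).
    """
    r = len(b)
    sequences = []
    for j in range(r):
        xi = [1 if k == j else 0 for k in range(r)]
        while len(xi) < length:
            m = len(xi) - r
            # Solve the recurrence for its leading term xi(m+r).
            xi.append(-sum(b[i] * xi[m + r - 1 - i] for i in range(r)))
        sequences.append(xi[:length])
    return sequences


def power_of_root(b, m):
    """Coefficients of theta**m on the basis 1, theta, ..., theta**(r-1),
    reduced with theta**r = -(b[0]*theta**(r-1) + ... + b[r-1])."""
    r = len(b)
    coeffs = [1] + [0] * (r - 1)       # theta**0
    for _ in range(m):
        top = coeffs[-1]               # coefficient that overflows to theta**r
        shifted = [0] + coeffs[:-1]    # multiply by theta
        coeffs = [shifted[j] - top * b[r - 1 - j] for j in range(r)]
    return coeffs


if __name__ == "__main__":
    b = [-1, -1]                       # hypothetical example: theta**2 - theta - 1 = 0
    xi = fundamental_sequences(b, 8)
    for m in range(8):
        # theta**m = xi_0(m) + xi_1(m) * theta
        assert [xi[0][m], xi[1][m]] == power_of_root(b, m)
    print(xi[0], xi[1])
```

Integer arithmetic is used throughout, so the comparison is exact; for the sample quadratic the two sequences are shifted Fibonacci numbers.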
Since the variables $x_1, \ldots, x_r$ are independent, the differential relations
of §2 go over, by (4.6), to corresponding relations for the $L_j$. Thus from (2.10),
(4.6) we have


or 
(4.20) | + ¢1(s)—— + + cs) Xr) =0 


Ox," 


and corresponding to (2.13), 


ot r—1 


Xs i=0 


whence 


o™ 
( %) = 0 (m,n = 1,--- 
OXm 


x) = 2, G =0,---,r—1). 
i=0 n=0 




The expressions for the $L_j$ as polynomials in the functions defined in
(2.8) follow at once from (4.4)-(4.6),


+, %) --- + jue (*) Rod ), 
0 


the sum extending to all $0 \le j_1, \ldots, j_r \le r-1$. The N's in (4.12) have a
similar structure in terms of L's. The addition theorems are of the same
type, but simpler,


F(x, + *** Xe + Yr) + k)F (x1, Xr) Yr), 


summed for $0 \le j, k \le r-1$.

From this point on, the connection with partial differential equations is 
of the same kind as that for the Appell functions and the equations discussed 
by Humbert and others in the papers cited in §5. The note ™ sufficiently 
indicates the start. 

5. References. Several of the following papers contain further references 
to the literature of Appell’s functions and their connection with differential 
equations. The references are given in chronological order. Humbert (loc. cit., 
p. 153) attributes Olivier’s functions to Yvon Villarceau, without stating the 
reference. 

1. L. Olivier, Bemerkungen über eine Art von Funktionen ..., Crelle's
Journal, vol. 2 (1827), pp. 243-251.

2. J. W. L. Glaisher, On functions with recurring derivatives, Proceedings 
of the London Mathematical Society, vol. 4 (1872), pp. 113-116. 

3. P. Appell, Sur certaines fonctions analogues aux fonctions circulaires, 
Comptes Rendus, Paris, vol. 84 (1877), pp. 1378-1380. 

4. J. W. L. Glaisher, Functions analogous to the sine and cosine, Quarterly 
Journal, vol. 16 (1879), pp. 15-33. 

5. A. Cayley, On Latin squares, Messenger of Mathematics, vol. 19 (1890),
pp. 135-137 (Collected Papers, vol. 13, No. 903).

6. P. A. MacMahon, Combinatorial Analysis, vol. 1, 1915. 

7. E. T. Bell, Periodic functions of n variables connected with an algebraic 
number field of degree n, Quarterly Journal, vol. 50 (1927), pp. 314-328. 

8. P. Humbert, Sur une généralisation de l'équation de Laplace, Journal des
Mathématiques, (9), vol. 8 (1929), pp. 145-159.

9. D. V. Jonescu, Sur une équation aux dérivées partielles du troisième
ordre, Bulletin de la Société Mathématique de France, vol. 58 (1930),
pp. 224-229.

10. E. T. Bell, Periodic recurring series, Proceedings of the National 
Academy of Sciences, vol. 16 (1930), pp. 750-752. 




11. J. Devisme, Comptes Rendus, Paris, vol. 193 (1931), pp. 981-983, 
825-828; ibid., vol. 194, pp. 516-519. 

12. P. Humbert, On Appell's function P(θ, φ), Proceedings of the Edinburgh
Mathematical Society, (2), vol. 3 (1932), pp. 53-55.

13. J. Devisme, Sur la fonction génératrice de la fonction P(mθ, nφ)
d'Appell, Académie Royale de Belgique, Bulletin, (5), vol. 18 (1932),
pp. 505-506.

14. E. T. Bell, A Laplacian equation, American Mathematical Monthly, 
vol. 39 (1932), pp. 515-517. 


CALIFORNIA INSTITUTE OF TECHNOLOGY,
PASADENA, CALIF. 


SUFFICIENT CONDITIONS IN THE PROBLEM OF 
THE CALCULUS OF VARIATIONS IN n-SPACE 
IN PARAMETRIC FORM AND UNDER 
GENERAL END CONDITIONS* 


BY 
SUMNER BYRON MYERS 


1. Introduction. Sufficient conditions in the general problem of the calculus
of variations in parametric form are given here. The results are in
terms of the characteristic roots of a linear boundary value problem, and are
in close relation to the conditions recently given by Morse† in the corresponding
problem in non-parametric form.

An important feature of the results is that the usual “non-tangency” 
hypothesis is not made. For example, if these results were applied to the 
problem of minimizing an integral along curves joining a point to a manifold, 
we would obtain sufficient conditions for a minimum even in the case that 
the minimizing curve is tangent to the manifold. 

The essential idea in the methods used in the paper is the treatment of 
the parametric problem as the limiting case of a series of non-singular non- 
parametric problems by means of a suitable modification of the integrand. 
Although they lack the geometric invariance of methods now being developed 
by Morse, in which the parametric problem is approximated by means of a 
series of parametric problems of the same nature as the original problem, the 
methods and results of this paper derive advantage from the non-singularity 
of the approximating non-parametric problems and from the fact that the 
cases of “non-tangency” and “tangency” are treated together. The work of 
the author and that of Morse are thus complementary, and constitute the
first complete treatment of sufficient conditions in the general parametric
problem.


* Presented to the Society, March 26, 1932; received by the editors April 19, 1932. 

† Certain results in the following papers will be used.

Morse, Sufficient conditions in the problem of Lagrange with variable end conditions, American 
Journal of Mathematics, vol. 53 (1931), pp. 517-546. 

Morse and Myers, The problems of Lagrange and Mayer with variable end points, Proceedings of 
the American Academy of Arts and Sciences, vol. 66 (1931), pp. 235-253. 

Bliss, Jacobi’s condition for problems of the calculus of variations in parametric form, these Trans- 
actions, vol. 17 (1916), pp. 195-206. 

Further references can be found in the three papers just cited. 




2. The Euler equations and the transversality conditions. In the space
of the variables
$(x) = (x_1, \ldots, x_n)$
let there be given an ordinary arc g
(2.1) $x_i = \bar{x}_i(t), \qquad t^{(1)} \le t \le t^{(2)} \qquad (i = 1, \ldots, n),$


of class C'.
We consider ordinary arcs of class D’ neighboring g. The initial and final 
end points of such arcs will be denoted respectively by 


$(x^s) = (x_1^s, \ldots, x_n^s) \qquad (s = 1, 2)$


and the end values of the parameter t will be denoted respectively by $t^s$
(s=1, 2), where s=1 at the initial end point and s=2 at the final end point.
An ordinary arc of class D' neighboring g will be said to be admissible if its
end points are given for some value of


$(\alpha) = (\alpha_1, \ldots, \alpha_r)$
by the functions


(2.2) $x_i^s = x_i^s(\alpha_1, \ldots, \alpha_r)$, $0 \le r \le 2n$* $\qquad (i = 1, \ldots, n;\ s = 1, 2).$


These functions of (α) are of class C'' for (α) near (0) and reduce to the end
points of g for (α) = (0). We assume that the functional matrix of the functions
in (2.2)


$\| x_{ih}^s \| \qquad (i = 1, \ldots, n;\ h = 1, \ldots, r;\ s = 1, 2)$


is of rank r for (α) = (0). Here and henceforth the subscript h attached to
$x_i^s$ shall denote differentiation with respect to $\alpha_h$.

We seek conditions under which the arc g and the set (α) = (0) afford a
minimum to the expression


(2.3) $J = \int_{t^1}^{t^2} F(x, \dot{x})\,dt + \theta(\alpha)$


among sets (α) near (0) and admissible arcs neighboring g with end points
determined by these sets (α). The function $F(x, \dot{x})$ is defined for (x) in an
open region containing g and for $(\dot{x})$ any set not (0), and is to be of class C'''.
The function θ is to be of class C'' for (α) near (0).

Furthermore, the function F is to satisfy the usual homogeneity relation 


(2.4) $F(x, k\dot{x}) = kF(x, \dot{x}) \qquad (k > 0).$


* The case r=0 yields the fixed end point problem. This case will be treated separately at the 
end of the paper, so that until then we shall assume that r>0. 




Certain necessary conditions are obtained immediately by treating the 
problem as a non-parametric problem of minimizing J among curves of class 
D' in the (n+1)-space of the variables (t, x) whose end points satisfy (2.2)
and the conditions .*

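THEOREM 1. If g affords a minimum to J in the problem, then along g the
following equations must be satisfied:


(2.5) $\dfrac{d}{dt}\dfrac{\partial F}{\partial \dot{x}_i} - \dfrac{\partial F}{\partial x_i} = 0,$


while the following transversality relations must hold:


(2.6) $\left[\dfrac{\partial F}{\partial \dot{x}_i}\, x_{ih}^s\right]_1^2 + \theta_h = 0$† $\qquad (h = 1, \ldots, r;\ i = 1, \ldots, n).$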


We shall now state and prove a theorem which will be useful later. 

THEOREM 2. For an arbitrary set of functions $\eta_i(t)$ of class D' such that
$\eta_i(t^{(s)}) = x_{ih}^s u_h$ (i = 1, ..., n; h = 1, ..., r; s = 1, 2) for some set of numbers
$(u_1, \ldots, u_r)$, there exists a one-parameter family of admissible arcs


(2.7) $x_i = x_i(t, e), \qquad \alpha_h = \alpha_h(e)$


containing g for e=0, with $\eta_i(t)$ and $u_h$ as its respective variations; that is, the
functions in (2.7) will have the following properties:

(2.8) $x_i(t, 0) = \bar{x}_i(t), \quad x_i(t^{(s)}, e) = x_i^s[\alpha(e)], \quad \alpha_h(0) = 0, \quad x_{ie}(t, 0) = \eta_i(t), \quad \alpha_h'(0) = u_h$
$\qquad (i = 1, \ldots, n;\ h = 1, \ldots, r;\ s = 1, 2).$

Furthermore, the functions $x_i(t, e)$ and $x_{ie}(t, e)$ are continuous and have continuous
derivatives with respect to e for e near 0 and t in the interval $t^{(1)} \le t \le t^{(2)}$,
while the functions $\dot{x}_i(t, e)$ and $\dot{x}_{ie}(t, e)$ have the same properties except possibly
at the values of t defining the corners of (η). The functions $\alpha_h(e)$ are of class C''.
For the following is such a family:

(2.9) $x_i = \bar{x}_i(t) + e[\eta_i(t) - \eta_i^1 h^1(t) - \eta_i^2 h^2(t)] + [x_i^1(eu) - \bar{x}_i^1]\,h^1(t) + [x_i^2(eu) - \bar{x}_i^2]\,h^2(t), \qquad \alpha_h = eu_h,$

* See Morse and Myers, p. 245, loc. cit. 

† Here and henceforth $[\ \ ]_1^2$ shall mean the difference between the value of the bracket evaluated
for s=2 and $(x, \dot{x})$ at the final end point of g, and the corresponding evaluation at the initial
end point of g. Also, an index repeated in the same term shall always mean summation with respect
to that index. The notation $\theta_h$ stands for $(\partial\theta/\partial\alpha_h)(0)$.




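where $h^1(t)$, $h^2(t)$ are any functions of class C' such that
$h^1(t^{(1)}) = 1, \quad h^1(t^{(2)}) = 0, \quad h^2(t^{(1)}) = 0, \quad h^2(t^{(2)}) = 1,$


while $\eta_i^s$ is an abbreviation for $\eta_i(t^{(s)})$ and $\bar{x}_i^s$ is an abbreviation for $\bar{x}_i(t^{(s)})$.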
3. The accessory boundary problem and a further necessary condition.
We assume now that g is an extremal satisfying the transversality conditions
(2.6). We shall use permanently the notations
$\eta_i(t) = x_{ie}(t, 0),$


$\eta_i^s = \eta_i(t^{(s)}),$


$u_h = \alpha_h'(0).$


Consider now a family of admissible arcs of form (2.7) satisfying the first 
three conditions of (2.8) and possessing the differentiability properties of 
Theorem 2. If we consider this family momentarily as a family of arcs in 
(t, x)-space satisfying the end conditions 


= (a), 


(3.1) 
= (¢=1,---,m;s =1, 2), 


then we can apply known results* to obtain the second variation of J along 
this family. We find that 


(3.2) $J''(0) = b_{hk}u_hu_k + 2\int_{t^{(1)}}^{t^{(2)}} \omega(\eta, \dot\eta)\,dt \qquad (h, k = 1, \ldots, r)$


where 


and 


(3.4) $b_{hk} = \left[\dfrac{\partial F}{\partial \dot{x}_i}\, x_{ihk}^s\right]_1^2 + \theta_{hk}.$


With the idea of dominating the sign of the second variation by adding
new terms, we are led to consider the accessory problem of minimizing


(3.5) $I(\eta, u, \sigma) = b_{hk}u_hu_k + \int_{t^{(1)}}^{t^{(2)}} [2\omega - \sigma(\dot\eta_i\dot\eta_i + \eta_i\eta_i)]\,dt$

$\qquad (i = 1, \ldots, n;\ h, k = 1, \ldots, r)$
for a given number σ, relative to constants (u) and functions (η) of class D'
satisfying

* See Morse, p. 521, loc. cit. 




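(3.6) $\eta_i^s = x_{ih}^s u_h \qquad (i = 1, \ldots, n;\ h = 1, \ldots, r;\ s = 1, 2).$

A solution (η), (u) of this new minimum problem in which the functions
(η) are of class C'' must satisfy the conditions of the following boundary
value problem:


(3.7) $\dfrac{d}{dt}\dfrac{\partial\Omega}{\partial\dot\eta_i} - \dfrac{\partial\Omega}{\partial\eta_i} = 0,$


(3.8) $b_{hk}u_k + \left[x_{ih}^s\dfrac{\partial\Omega}{\partial\dot\eta_i}\right]_1^2 = 0 \qquad (h, k = 1, \ldots, r),$


(3.9) $\eta_i^s = x_{ih}^s u_h \qquad (s = 1, 2),$
where
(3.10) $2\Omega(\eta, \dot\eta, \sigma) = 2\omega(\eta, \dot\eta) - \sigma(\dot\eta_i\dot\eta_i + \eta_i\eta_i) \qquad (i = 1, \ldots, n).$

This boundary problem we shall call the accessory boundary problem. By
a solution of the accessory boundary problem is meant a set of functions
$\eta_i(t)$ of class C'' which with constants (u) and σ satisfy the conditions of
the problem. A characteristic solution is one for which (η) ≢ (0).

The corresponding value of σ will be called a characteristic root.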

The following lemma and theorem can be proved in a manner similar to
that used by Morse in his proof of the corresponding results for the
non-parametric problem.† In the proof of Theorem 3, Theorem 2 must be used.

LEMMA 1. If (η) is a characteristic solution with constants (u) and σ,
I(η, u, σ) = 0.

THEOREM 3. If g furnishes a minimum for the given problem, there can exist 
no characteristic root 

4. The function J(α) and the quadratic form H(u, σ). By the Legendre
sufficient condition we mean the condition

(4.1) $F_{\dot{x}_i\dot{x}_j}\pi_i\pi_j > 0 \qquad (i, j = 1, \ldots, n)$


along g, for all sets (π) ≠ (0) and not proportional to $(d\bar{x}/dt)$.
By the Weierstrass sufficient condition we mean the condition


(4.2) $E(x, \dot{x}, y) = F(x, y) - y_i\dfrac{\partial F}{\partial\dot{x}_i}(x, \dot{x}) > 0 \qquad (i = 1, \ldots, n)$


for all (x), $(\dot{x})$ on g, and for all (y) ≠ (0) and not proportional to $(\dot{x})$.
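
The exclusion of directions proportional to $(\dot{x})$ in these two conditions is forced by the homogeneity relation (2.4). The following Python sketch checks this numerically for the arc-length integrand $F = (\dot{x}_1^2 + \dot{x}_2^2)^{1/2}$ in the plane, which is an assumed example and not an integrand from the paper. It verifies that the matrix of second derivatives of F annihilates the direction $\dot{x}$ while its quadratic form is positive in a transverse direction, that the E-function vanishes on a positive multiple of $\dot{x}$ and is positive elsewhere, and that subtracting σ times the identity with σ < 0 makes the determinant non-zero. All function names are hypothetical.

```python
import math


def F(v):
    """Arc-length integrand F(xdot) = |xdot| in the plane (assumed example)."""
    return math.hypot(v[0], v[1])


def grad_F(v):
    """First derivatives of F with respect to xdot."""
    n = F(v)
    return (v[0] / n, v[1] / n)


def hessian_F(v):
    """Second derivatives of F with respect to xdot: (I - u u^T) / |v|, u = v/|v|."""
    n = F(v)
    a, b = v[0] / n, v[1] / n
    return (((1 - a * a) / n, -a * b / n),
            (-a * b / n, (1 - b * b) / n))


def quadratic_form(H, p):
    """Value of H_ij p_i p_j."""
    return sum(H[i][j] * p[i] * p[j] for i in range(2) for j in range(2))


def weierstrass_E(v, y):
    """E = F(y) - y_i * dF/dxdot_i(v) for an integrand homogeneous of degree one."""
    g = grad_F(v)
    return F(y) - (y[0] * g[0] + y[1] * g[1])


def shifted_determinant(H, sigma):
    """Determinant of H - sigma * I."""
    return (H[0][0] - sigma) * (H[1][1] - sigma) - H[0][1] * H[1][0]


if __name__ == "__main__":
    v = (3.0, 4.0)
    H = hessian_F(v)
    print("F(2v) - 2F(v)      :", F((6.0, 8.0)) - 2 * F(v))
    print("H applied to v     :", [H[i][0] * v[0] + H[i][1] * v[1] for i in range(2)])
    print("form on (1, 0)     :", quadratic_form(H, (1.0, 0.0)))
    print("E on 2v, on (1, 0) :", weierstrass_E(v, (6.0, 8.0)), weierstrass_E(v, (1.0, 0.0)))
    print("det(H), det(H + I) :", shifted_determinant(H, 0.0), shifted_determinant(H, -1.0))
```

The vanishing of the unshifted determinant is the singularity of the parametric problem; the shift by a negative σ is one way to remove it in this toy case.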


† See Morse, p. 254, loc. cit.




We shall assume henceforth that g is an extremal along which the Le- 
gendre sufficient condition holds. Among the well known consequences of 
this assumption are the following: 

(1) The determinant 

0 
along g. 
(2) The functions $\bar{x}_i(t)$ are of class C'''.
(3) The characteristic determinant
$| F_{\dot{x}_i\dot{x}_j} - \sigma\delta_i^j |$
does not vanish for σ < 0, where $\delta_i^j$ is the Kronecker delta.

A set (α) neighboring (α) = (0) determines through (2.2) two end points
P and Q near the respective end points of g. If we assume for the moment
that the end points of g are not conjugate, then P and Q can be joined by a
unique extremal E, which is thus determined by the set (α). We can thus
obtain a family of extremals determined by values of (α) near (0), and this
family can be represented in the following form:


(4.3) $x_i = x_i^*(t, \alpha)$
where $x_i^*$ and $x_{it}^*$ are of class C'' in (α) and satisfy the following conditions:
(4.4) $x_i^*(t, 0) = \bar{x}_i(t) \qquad (i = 1, \ldots, n),$
(4.5) $x_i^*(t^{(s)}, \alpha) = x_i^s(\alpha) \qquad (s = 1, 2).$
The expression J taken along the extremals of the family (4.3) becomes a
function J(α) of class C''.
The Euler equations (2.5) and the transversality conditions (2.6) enable
us to prove that J(α) has a critical point for (α) = (0).


The terms of the second order of J(α) are obtained by means of the
following identity in the variables $(u_1, \ldots, u_r)$:


(4.6) $J_{\alpha_h\alpha_k}(0)\,u_hu_k = \dfrac{d^2}{de^2}J(eu)\Big|_{e=0} \qquad (h, k = 1, \ldots, r).$


The right hand side of (4.6) is nothing but the second variation of the
one-parameter family of extremals obtained from the family (4.3) by setting
$\alpha_h = eu_h$, where $u_h$ is fixed and e is variable. This one-parameter family has the
form


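(4.7) $x_i = x_i(t, e), \qquad \alpha_h = eu_h \qquad (i = 1, \ldots, n;\ h = 1, \ldots, r),$


where
(4.8) $x_i(t^{(s)}, e) = x_i^s(eu) \qquad (s = 1, 2).$


The second variation of the family (4.7) has the form (3.2), so that


(4.9) $J_{\alpha_h\alpha_k}(0)\,u_hu_k = b_{hk}u_hu_k + 2\int_{t^{(1)}}^{t^{(2)}} \omega(\eta, \dot\eta)\,dt \qquad (h, k = 1, \ldots, r).$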


A curve $\eta_i = \eta_i(t)$ of class C'' in the space of the variables (t, η) will be
called a secondary extremal if the functions (η) satisfy (3.7) for some σ. At
present we are concerned only with secondary extremals for σ = 0.

To show the complete relation between (u) and (η) in (4.9), we need the
following lemma.


LEMMA 2. The integral $\int_{t_1}^{t_2} \omega\,dt$ has the same value if evaluated along any two
secondary extremals joining the same end points A: $(t_1, a)$ and B: $(t_2, b)$.


Suppose that $(\bar\eta)$ and (η) are the two secondary extremals. Then


$\eta_i^e = \bar\eta_i + e(\eta_i - \bar\eta_i) \qquad (i = 1, \ldots, n)$


is a one-parameter family of secondary extremals joining A and B and containing
$(\bar\eta)$ and (η). But the value of an integral taken along the members of a
one-parameter family of extremals joining the same end points is the same
for each extremal.

Returning now to (4.9), we note that the functions $\eta_i(t)$ in the argument of
ω define a secondary extremal E', since they are the variations of a family of
extremals. The set (u) in (4.9) determines the end points of E'; for upon
differentiating (4.8) with respect to e and setting e=0, we obtain


(4.10) $\eta_i^s = x_{ih}^s u_h \qquad (s = 1, 2),$


and it is in this sense that the set (u) determines the end points of E'.
From (4.9) and Lemma 2 we obtain the following theorem:


THEOREM 4. Under the assumption that the end points of g are not conjugate,
let J(α) represent the value of J taken along the extremal determined by (α). Then
the terms of second order of J(α) have the form


(4.11) $J_{\alpha_h\alpha_k}(0)\,u_hu_k = b_{hk}u_hu_k + 2\int_{t^{(1)}}^{t^{(2)}} \omega(\eta, \dot\eta)\,dt \qquad (h, k = 1, \ldots, r)$


where (η) may be taken along any secondary extremal with end points determined
by (u).




In order to bring the parameter σ into the second variation as in (3.5),
we replace the integrand F by a one-parameter family of integrands


(4.12) $F^\sigma = F - \dfrac{\sigma}{2}\{[x_i - \bar{x}_i(t)][x_i - \bar{x}_i(t)] + [\dot{x}_i - \dot{\bar{x}}_i(t)][\dot{x}_i - \dot{\bar{x}}_i(t)]\}$


which we consider only for σ < 0. For σ = 0 we have our original problem in
(x)-space, but for each σ < 0 we consider a non-parametric problem in
(t, x)-space, the problem with the integral
$\int F^\sigma\,dt$
and the end conditions


When we talk about extremals, conjugate points, etc., for σ < 0, these
terms will always be understood to refer to the non-parametric problem in
(t, x)-space.

For each σ < 0, g: $x_i = \bar{x}_i(t)$ is still an extremal. We note that the problem
for each σ < 0 is non-singular; that is, along g the determinant


$| F^\sigma_{\dot{x}_i\dot{x}_j} | \ne 0.$


This is a consequence of the Legendre sufficient condition. If, then, we assume
momentarily that for each σ < 0 the end points of g are not conjugate,
a set (α) neighboring (0) will determine for each σ < 0 a unique extremal, and
the expression


$J = \int F^\sigma\,dt + \theta(\alpha)$
becomes a function J(α, σ). The following theorem is proved as was Theorem 4.


THEOREM 4a. Under the assumption that the end points of g are not conjugate
for any σ ≤ 0, let J(α, σ) represent the value of J taken along the extremal determined
by (α) for any σ ≤ 0. Then the terms of second order of J(α, σ) have the form


(4.15) $H(u, \sigma) = J_{\alpha_h\alpha_k}(0, \sigma)\,u_hu_k = b_{hk}u_hu_k + 2\int_{t^{(1)}}^{t^{(2)}} \Omega(\eta, \dot\eta, \sigma)\,dt.$


For σ < 0, (η) is taken along the secondary extremal determined by (u) through
(4.10), while for σ = 0, (η) may be taken along any secondary extremal with end
points determined by (u) through (4.10).




5. Sufficient conditions for a minimum. Consider the expression


$I(\eta, u, \sigma) = b_{hk}u_hu_k + 2\int_{t^{(1)}}^{t^{(2)}} \Omega(\eta, \dot\eta, \sigma)\,dt \qquad (h, k = 1, \ldots, r).$


By an admissible set (u, η) will be meant a set of constants (u) and a set of
functions (η) of class D' which together satisfy (3.9).

THEOREM 5. For sufficiently large negative values of σ, the expression
I(η, u, σ) is positive for all admissible sets (u, η) ≠ (0, 0).

First we note that since $\|x_{ih}^s\|$ is of rank r, equations (3.9) can be solved
for $u_h$ in terms of a subset of the variables $\eta_i^s$. Hence for all admissible sets
(u, η)


(5.1) $I(\eta, u, \sigma) = q(\eta^s) + 2\int_{t^{(1)}}^{t^{(2)}} \Omega(\eta, \dot\eta, \sigma)\,dt,$


where $q(\eta^s)$ is a form quadratic in the variables $\eta_i^s$. From this it follows† that
for all admissible sets (u, η)


(5.2) $I(\eta, u, \sigma) \ge \int_{t^{(1)}}^{t^{(2)}} [2\omega(\eta, \dot\eta) + M(\eta, \dot\eta) - \sigma\eta_i\eta_i - \sigma\dot\eta_i\dot\eta_i]\,dt \qquad (i = 1, \ldots, n),$

where $M(\eta, \dot\eta)$ is a suitably chosen form quadratic in the variables $(\eta, \dot\eta)$
with coefficients continuous in t.

But any such form as the integrand in (5.2) can be made positive definite
by making σ negative and sufficiently large, independently of t.
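
The mechanism behind this last remark can be seen in a toy case. The Python sketch below takes a fixed symmetric 2 by 2 matrix Q, standing in for the coefficient matrix of the quadratic form at one value of t, and checks that Q minus σ times the identity is positive definite exactly when σ lies below the smallest eigenvalue of Q. The matrix entries and function names are assumptions chosen for illustration; in the paper the form has 2n variables and coefficients continuous in t on a closed interval, which is what allows one value of σ to serve for all t.

```python
import math


def smallest_eigenvalue(a, b, c):
    """Smallest eigenvalue of the symmetric matrix [[a, b], [b, c]]."""
    return (a + c) / 2 - math.hypot((a - c) / 2, b)


def is_positive_definite(a, b, c):
    """Sylvester's criterion for the symmetric matrix [[a, b], [b, c]]."""
    return a > 0 and a * c - b * b > 0


def shifted_is_positive_definite(a, b, c, sigma):
    """Test whether Q - sigma * I is positive definite, with Q = [[a, b], [b, c]]."""
    return is_positive_definite(a - sigma, b, c - sigma)


if __name__ == "__main__":
    q = (-2.0, 3.0, 1.0)               # hypothetical indefinite form
    lam = smallest_eigenvalue(*q)
    for sigma in (-10.0, -4.0, -3.0, 0.0):
        # The two booleans printed on each line should agree.
        print(sigma, shifted_is_positive_definite(*q, sigma), sigma < lam)
```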

Thus for such a σ,
(5.3) $I(\eta, u, \sigma) > 0$


for all admissible sets (u, η) ≠ (0, 0).
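LEMMA 3. Let (u, η) be any admissible set. If there is no point on g conjugate
to its initial point for $\sigma = \sigma_0 < 0$, then
(5.4) $H(u, \sigma_0) \le I(\eta, u, \sigma_0),$
the equality holding if and only if (η) is a secondary extremal for $\sigma = \sigma_0$.


By Theorem 4a the equality holds if (η) is a secondary extremal for $\sigma = \sigma_0$.
If (η) is not a secondary extremal, let $(\bar\eta)$ be the secondary extremal determined
by (u) for $\sigma = \sigma_0$.

We note that along any arc (η) $(t^{(1)} \le t \le t^{(2)})$,


$\Omega_{\dot\eta_i\dot\eta_j}(\eta, \dot\eta, \sigma_0)\,\pi_i\pi_j = F_{\dot{x}_i\dot{x}_j}\pi_i\pi_j - \sigma_0\pi_i\pi_i \qquad (i, j = 1, \ldots, n)$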


† See Morse, p. 534, loc. cit.




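which, by the Legendre condition, is positive for all (π) ≠ (0). Also, if we use
Taylor's formula we see that the Weierstrass E-function


$\Omega(\eta, \dot\eta', \sigma_0) - \Omega(\eta, \dot\eta, \sigma_0) - \Omega_{\dot\eta_i}(\eta, \dot\eta, \sigma_0)(\dot\eta_i' - \dot\eta_i) \qquad (i = 1, \ldots, n)$


is equal to
$\tfrac{1}{2}\,\Omega_{\dot\eta_i\dot\eta_j}(\eta, \dot\eta, \sigma_0)(\dot\eta_i' - \dot\eta_i)(\dot\eta_j' - \dot\eta_j) \qquad (i, j = 1, \ldots, n)$
and so is positive along any arc (η) and for all $(\dot\eta') \ne (\dot\eta)$.

These facts, together with the hypothesis that there is no point on g
conjugate to its initial point for $\sigma = \sigma_0$, enable us to infer that the secondary
extremal $(\bar\eta)$ minimizes $I(\eta, u, \sigma_0)$ in the fixed end point problem; that is,
$I(\bar\eta, u, \sigma_0) < I(\eta, u, \sigma_0)$. The lemma follows from Theorem 4a.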


LEMMA 4. If $I(\eta, u, \sigma_0)$ $(\sigma_0 < 0)$ is positive for all admissible sets (u, η) ≠
(0, 0), then there is no pair of conjugate points on g for $\sigma = \sigma_0$.


For if $t_2$ were conjugate to $t_1$ on g for $\sigma = \sigma_0$, there would exist a secondary
extremal $(\bar\eta) \not\equiv (0)$ vanishing at $t_1$ and $t_2$. Then $I(\eta, u, \sigma_0)$ would be zero if
evaluated for (u) = (0) and for (η) taken along the broken secondary extremal
consisting of $(\bar\eta)$ in the interval $t_1t_2$ and the t-axis in the remainder (if any)
of the interval $t^{(1)}t^{(2)}$. This is contrary to hypothesis.


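LEMMA 5. If there is no point on g conjugate to its initial point for $\sigma = \sigma_0 < 0$,
then there is no point on g conjugate to its initial point for σ in the neighborhood
of $\sigma_0$.

For each σ < 0, the points conjugate to $t = t^{(1)}$ are defined by the zeros
of the determinant $D(t, \sigma) = |\eta_{ij}(t, \sigma)|$, where $\|\eta_{ij}(t, \sigma)\|$ is a matrix
each column of which represents a secondary extremal for σ, and which
satisfies the conditions


$\eta_{ij}(t^{(1)}, \sigma) = 0, \qquad \dot\eta_{ij}(t^{(1)}, \sigma) = \delta_i^j \qquad (i, j = 1, \ldots, n;\ \delta_i^j = \text{Kronecker delta}).$


Now by means of the integral Law of the Mean, the function $\eta_{ij}(t, \sigma)$
can be expressed in the form


$\eta_{ij}(t, \sigma) = (t - t^{(1)})\int_0^1 \dot\eta_{ij}[t^{(1)} + \tau(t - t^{(1)}), \sigma]\,d\tau = (t - t^{(1)})\,a_{ij}(t, \sigma) \qquad (i, j = 1, \ldots, n),$


where $a_{ij}(t, \sigma)$ is continuous for $t^{(1)} \le t \le t^{(2)}$ and σ < 0, and where


$a_{ij}(t^{(1)}, \sigma) = \dot\eta_{ij}(t^{(1)}, \sigma) = \delta_i^j \qquad (i, j = 1, \ldots, n).$


Thus


$D(t, \sigma) = (t - t^{(1)})^n\,|a_{ij}(t, \sigma)|.$


Since $D(t, \sigma_0) \ne 0$ for $t^{(1)} < t \le t^{(2)}$ by hypothesis, we see that $|a_{ij}(t, \sigma_0)| \ne 0$
for $t^{(1)} \le t \le t^{(2)}$. It follows from the continuity of $a_{ij}(t, \sigma)$ that $|a_{ij}(t, \sigma)|$
is ≠ 0 in the interval $t^{(1)} \le t \le t^{(2)}$ for σ near $\sigma_0$. Hence $D(t, \sigma) \ne 0$ for σ near $\sigma_0$
in the interval $t^{(1)} < t \le t^{(2)}$, and the lemma is proved.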


THEOREM 6. If there exist no negative characteristic roots, then I(η, u, 0) ≥ 0
for all admissible sets (u, η).


For σ negative and sufficiently large, I(η, u, σ) is, by Theorem 5, positive
for all admissible sets (u, η) ≠ (0, 0). Suppose we now increase σ towards
zero. Then I(η, u, σ) either remains positive for σ < 0 and for all admissible
sets (u, η) ≠ (0, 0), or else there is a least upper bound $\sigma_0 < 0$ of the values of σ
for which I(η, u, σ) is positive for all admissible sets (u, η) ≠ (0, 0). We shall
show that the latter case is impossible.

Suppose there does exist such a least upper bound $\sigma_0$. Then either $I(\eta, u, \sigma_0)$
is positive for all admissible sets (u, η) ≠ (0, 0), or else $I(\eta, u, \sigma_0)$ is zero
for some such sets. If $I(\eta, u, \sigma_0)$ is zero for an admissible set $(\bar{u}, \bar\eta) \ne (0, 0)$
then $(\bar{u}, \bar\eta)$ must minimize $I(\eta, u, \sigma_0)$ among admissible sets (u, η). Hence $(\bar\eta)$
must be a secondary extremal for $\sigma = \sigma_0$ satisfying (3.8) and (3.9), contrary
to the hypothesis that there exist no negative characteristic roots. Thus
$I(\eta, u, \sigma_0)$ must be positive for all admissible sets (u, η) ≠ (0, 0).

Lemma 4 then enables us to set up the quadratic form $H(u, \sigma_0)$, which
must be positive definite. By Lemma 5, we can set up H(u, σ) for σ slightly
greater than $\sigma_0$, and it must be positive definite for σ slightly greater than
$\sigma_0$. By Lemma 3, I(η, u, σ) must then be positive for all admissible sets
(u, η) ≠ (0, 0) for σ slightly greater than $\sigma_0$. This contradicts the hypothesis
that $\sigma_0$ is the least upper bound of the values of σ for which I(η, u, σ) is positive
for all admissible sets (u, η) ≠ (0, 0).

We conclude, then, that I(η, u, σ) is positive for all σ < 0 and for all
admissible sets (u, η) ≠ (0, 0).

It follows, then, that I(η, u, 0) ≥ 0 for all admissible sets (u, η).

A set of functions (η) will be called tangential if they are of the form


(5.5) $\eta_i = \dot{\bar{x}}_i(t)\,\rho(t)$
where ρ(t) is any function of class D'.


LEMMA 6. A set of tangential functions of class C'' represents a secondary
extremal for σ = 0.




The one-parameter family
$x_i = \bar{x}_i[t + e\rho(t)] \qquad (i = 1, \ldots, n),$


where ρ(t) is any function of class C'', is certainly a family of extremals, for
its members are simply different representations of the same extremal g.
Hence the variations $\eta_i(t)$ of this family represent a secondary extremal. But
for this family


$\eta_i = \dot{\bar{x}}_i(t)\,\rho(t)$


and so the lemma is proved.
Such a secondary extremal we shall call tangential.


LEMMA 7. A tangential secondary extremal (not the t-axis) vanishing at $t^{(1)}$ and
$t^{(2)}$ is a characteristic solution for σ = 0.


That such a secondary extremal satisfies (3.8) for σ = 0, (u) = (0), follows
from the relation
$\dfrac{\partial^2 F}{\partial\dot{x}_i\,\partial\dot{x}_j}\,\dot{x}_j = 0 \qquad (i, j = 1, \ldots, n).$

THEOREM 7. If there exist no negative characteristic roots, and no non-tangential
characteristic solutions for σ = 0, there is no point on g conjugate to its
initial point for σ = 0.

In the first place, $t^{(2)}$ cannot be conjugate to $t^{(1)}$ on g. For if it were, there
would be a normal* secondary extremal $(\bar\eta) \not\equiv (0)$ vanishing at $t^{(1)}$ and $t^{(2)}$.†
This curve $(\bar\eta)$, with the set (u) = (0), would make I(η, u, 0) vanish. Now by
Theorem 6, I(η, u, 0) is positive or zero for all admissible sets (u, η) and so
$(\bar\eta)$ with the set (u) = (0) would minimize I(η, u, 0) among admissible sets
(u, η). Hence $(\bar\eta)$ would have to satisfy conditions (3.8) and so be a characteristic
solution for σ = 0. Since $(\bar\eta)$ is non-tangential, this is contrary to
hypothesis.

Next suppose that $\bar{t} \ne t^{(2)}$ were conjugate to $t^{(1)}$ on g. Then there would
exist a normal secondary extremal $(\bar\eta) \not\equiv (0)$ vanishing at $t^{(1)}$ and $\bar{t}$. The expression
I(η, u, 0) would be zero if evaluated along the broken secondary
extremal (η) consisting of $(\bar\eta)$ in the interval $t^{(1)}\bar{t}$ and of the t-axis in the remainder
of the interval $t^{(1)}t^{(2)}$. The curve (η) would actually have a corner
at $\bar{t}$, because the only normal secondary extremal through a point on the
t-axis in the direction of the t-axis is (η) = (0).‡


* A normal secondary extremal is one which satisfies the relation
$\dot{\bar{x}}_i\eta_i = 0 \qquad (i = 1, \ldots, n).$

† See Bliss, loc. cit., p. 200.

‡ See Bliss, loc. cit., p. 199.


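The arc (η) with the set (u) = (0) would minimize I(η, u, 0) among admissible
sets (u, η), and so would have to satisfy the corner conditions


$\left[\dfrac{\partial\omega}{\partial\dot\eta_i}\right]_{\bar{t}-0}^{\bar{t}+0} = \dfrac{\partial^2 F}{\partial\dot{x}_i\,\partial\dot{x}_j}\,[\dot\eta_j]_{\bar{t}-0}^{\bar{t}+0} = 0 \qquad (i = 1, \ldots, n).$


From this it would follow, due to the actual presence of a corner at $\bar{t}$, that

(5.6) $[\dot\eta_j]_{\bar{t}-0}^{\bar{t}+0} = k\,\dot{\bar{x}}_j(\bar{t}), \qquad k \ne 0 \qquad (j = 1, \ldots, n).$
Hence

(5.7) $\dot{\bar\eta}_j(\bar{t}) = -k\,\dot{\bar{x}}_j(\bar{t}) \qquad (j = 1, \ldots, n).$


But this is impossible; for along the normal secondary extremal $(\bar\eta)$ we have


(5.8) $\dot{\bar{x}}_j\bar\eta_j = 0 \qquad (j = 1, \ldots, n),$
and hence, by differentiation,

(5.9) $\ddot{\bar{x}}_j\bar\eta_j + \dot{\bar{x}}_j\dot{\bar\eta}_j = 0 \qquad (j = 1, \ldots, n),$
which contradicts (5.7).

Thus there is no point on g conjugate to $t^{(1)}$.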

We come now to the final theorem. The arc g and the set (α) = (0) shall be
said to furnish a proper, strong, relative minimum to J if there exist a neighborhood
N of g and a neighborhood M of (α) = (0) such that the value of J
is less when evaluated for g and (α) = (0) than when evaluated for any other
admissible arc in N with ends determined by a set (α) in M.


THEOREM 8. In order that the extremal g, without multiple points, and the set
(α) = (0) afford a proper strong relative minimum to J it is sufficient that the
transversality conditions (2.6) be satisfied, that the Legendre and Weierstrass
sufficient conditions hold, that there be no negative characteristic roots, and that
there be no characteristic solutions for σ = 0 except the tangential solutions vanishing
at both ends.


Under the hypotheses of this theorem, Theorem 7 tells us that the end
points of g are not conjugate, and so we can set up the function J(α, 0),
and hence the quadratic form H(u, 0). According to Theorem 4, H(u, 0) is
equal to I(η, u, 0), where (η) is any secondary extremal with ends determined
by (u) through (3.9). By Theorem 6, H(u, 0) ≥ 0.

Now if H(u, 0) were 0 for some (u) ≠ (0), then I(η, u, 0) would be zero
if evaluated for (u) and any secondary extremal $(\bar\eta)$ with ends determined by
(u). Hence $(\bar\eta)$ would minimize I(η, u, 0) and so would satisfy (3.8) and be a
characteristic solution for σ = 0 not vanishing at both ends. This contradicts
the hypotheses. Thus H(u, 0) is positive definite.




Now the Legendre and Weierstrass sufficient conditions are assumed to
hold along g. Also, by Theorem 7, there is no point on g conjugate to its initial
point. Hence g furnishes a minimum to J in the fixed end point problem.
Furthermore, there exists a neighborhood N of g such that if an extremal E
determined by a set (α) lies in N, then, if (α) is sufficiently near (0), E will
afford a minimum to J in the fixed end point problem, with respect to admissible
arcs in N joining the end points of E.*

Let g' be any admissible arc in N, its end points being given by a certain
set (α). Then if (α) is near enough to (0) the extremal determined by (α)
will lie in N, and


(5.10) $J \ge J(\alpha).$


But H(α, 0) gives the terms of second order in J(α), so that for (α)
sufficiently near (0) we have


(5.11) $J(\alpha) \ge J(0),$


the equality holding only if (α) = (0).
Hence for g' sufficiently near g and (α) sufficiently near (0),


(5.12) $J \ge J(0).$


This inequality becomes an equality only if g' is identical with g.


Thus the theorem is proved. 
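6. The fixed end point problem. This is the case that r=0 and θ = constant,
the end conditions being


$x_i^s = \text{constant} \qquad (i = 1, \ldots, n;\ s = 1, 2).$


The expression I(η, u, σ) is replaced by


$I(\eta, \sigma) = 2\int_{t^{(1)}}^{t^{(2)}} \Omega\,dt,$


and the accessory boundary problem has the form


$\dfrac{d}{dt}\dfrac{\partial\Omega}{\partial\dot\eta_i} - \dfrac{\partial\Omega}{\partial\eta_i} = 0, \qquad \eta_i^s = 0 \qquad (i = 1, \ldots, n;\ s = 1, 2).$


The necessary condition of Theorem 3 holds as stated.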
To prove Theorem 8 in the fixed end point case, we shall prove that 
under the hypotheses of the theorem there is no point on g conjugate to its 


* Cf. Morse, loc. cit., p.535, and Bliss, Annals of Mathematics, April, 1932, p. 267, Lemma 1. 




initial point for σ = 0. This will follow if we can prove Theorem 7, which in
turn is based on Theorem 6.

The first two paragraphs in the proof of Theorem 6 hold as before. Next,
Lemma 4 shows us that there is no point on g conjugate to its initial point for
$\sigma = \sigma_0$, and Lemma 5 extends this property to values of σ slightly greater than
$\sigma_0$. Hence (η) = (0) furnishes a proper minimum to I(η, σ) (see proof of Lemma 3)
for these values of σ, and so I(η, σ) > 0 for these values of σ for (η) ≠ (0).
This contradicts the hypothesis that $\sigma_0$ was the least upper bound of the
values of σ for which I(η, σ) is positive for all admissible sets (η) ≠ (0).
Theorem 6 follows, and hence Theorems 7 and 8.


HARVARD UNIVERSITY, 
CAMBRIDGE, Mass. 


NOTES ON THE THEORY AND APPLICATION OF 
FOURIER TRANSFORMS. III, IV, V, VI, VII* 


BY 
R. E. A. C. PALEY AND N. WIENER


III. ON MÜNTZ'S THEOREM


1. We give a proof of the following theorem which is Szász's† generalization
of Müntz's‡ theorem. The method is similar to one employed by Carleman§
to prove Müntz's theorem. Incidentally we give a theorem concerning
the distribution of the zeros of functions analytic in a half-plane, which is
analogous to, and in some respects more general than, another theorem of
Carleman's paper.

2. We recall the following well known 


THEOREM I. Let
(2.1) $\zeta_n = \rho_n e^{i\theta_n}, \qquad 0 \le \rho_n < 1 \qquad (n = 1, 2, \ldots)$


be a set of points in the unit circle |ζ| < 1. Suppose that an analytic function
$f_1(\zeta)$ is regular in |ζ| < 1, satisfies


(2.2) $\int_0^{2\pi} |f_1(\rho e^{i\theta})|^2\,d\theta < B,$

where B is a constant which depends only on $f_1$, and has zeros at the points $\zeta_n$.
Then
(2.3) $\sum_{n=1}^{\infty} (1 - \rho_n) < \infty.$
Conversely, if the series (2.3) converges, then there exists a bounded function
$f_1(\zeta)$ which has zeros at the points $\zeta_n$.


Suppose we invert the interior of the unit circle into the half-plane
ℑ(z) > 0, by means of the substitution


* Presented to the Society, October 28, 1933; received by the editors March 16, 1933. Notes I 
and II of the Series have appeared in this volume of these Transactions, pp. 348-355. 

† O. Szász, Über die Approximation stetiger Funktionen durch lineare Aggregate von Potenzen,
Mathematische Annalen, vol. 77 (1916), pp. 482-496.

‡ C. H. Müntz, Über den Approximationssatz von Weierstrass, Schwarz's Festschrift, Berlin,
1914, pp. 303-312.

§ T. Carleman, Über die Approximation analytischer Funktionen durch lineare Aggregate von
vorgegebenen Potenzen, Arkiv för Matematik, Astronomi och Fysik, vol. 17 (1922-23), No. 9, pp. 1-30.


(2.4)    $\zeta = \frac{1 + iz}{1 - iz}.$


Suppose that the point $\zeta = \rho e^{i\theta}$ inverts into $z = x + iy$. Then it is a matter of
elementary algebra to verify that the convergence of the series (2.3) is equivalent
to that of the series

$\sum_{n=1}^{\infty} \frac{y_n}{1 + x_n^2 + y_n^2},$

$z_n = x_n + i y_n$ being the inverse of $\zeta_n = \rho_n e^{i\theta_n}$.
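The elementary algebra can be checked numerically. The sketch below assumes that the substitution (2.4) reads $\zeta = (1 + iz)/(1 - iz)$; under that reading $1 - |\zeta|^2 = 4y/((1 + y)^2 + x^2)$, and since $1 - \rho \le 1 - \rho^2 \le 2(1 - \rho)$ and $1 + x^2 + y^2 \le (1 + y)^2 + x^2 \le 2(1 + x^2 + y^2)$ for $y > 0$, the two series converge or diverge together. The function names and sample points are illustrative only.

```python
def disk_point(z: complex) -> complex:
    """Image of z under the assumed substitution zeta = (1 + iz) / (1 - iz)."""
    return (1 + 1j * z) / (1 - 1j * z)


def defect(z: complex) -> float:
    """1 - |zeta|^2 computed directly from the image point."""
    return 1.0 - abs(disk_point(z)) ** 2


def closed_form(z: complex) -> float:
    """The same quantity written as 4y / ((1 + y)^2 + x^2), z = x + iy."""
    x, y = z.real, z.imag
    return 4.0 * y / ((1.0 + y) ** 2 + x ** 2)


# A few sample points of the upper half-plane (arbitrary choices).
samples = [0.3 + 0.1j, -2.0 + 5.0j, 10.0 + 0.01j, 1j]
max_gap = max(abs(defect(z) - closed_form(z)) for z in samples)
```

The point $z = i$ is carried to the centre of the circle, which is a quick sanity check on the orientation of the map.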

We are now in a position to prove our next theorem, which is a generalization
of Carleman's:


THEOREM II. Let
(2.5)    $z_n = x_n + i y_n \quad (n = 1, 2, \dots)$


denote a sequence of points in the half plane $\Im(z) > 0$, and let $f(z)$ be a function
regular in $\Im(z) > 0$ which satisfies


(2.6)    $\int_{-\infty}^{\infty} |f(x + iy)|^2 \, dx < 1$


and has zeros at the points $z_n$. Then


(2.7)    $\sum_{n=1}^{\infty} \frac{y_n}{1 + x_n^2 + y_n^2} < \infty.$

Conversely, if the series (2.7) converges, then there exists a bounded function
$f(z)$ regular in $\Im(z) > 0$, satisfying (2.6) and having zeros at the points $z_n$.
For the second half of the theorem we have only to observe, by Theorem
I, that there exists a function $g(z)$ with zeros at the points $z_n$, analytic in
$\Im(z) > 0$, and less than 1 in absolute value. We have now only to write


$f(z) = \frac{g(z)}{\pi^{1/2}(z + i)}$


and condition (2.6) is satisfied.
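To prove the first part of the Theorem it will be sufficient to show that
the function $f(z)$ can be represented by a Cauchy integral


(2.8)    $f(z) = \frac{1}{2\pi i} \int_{-\infty}^{\infty} \frac{f^*(x')}{x' - z} \, dx', \quad \Im(z) > 0,$


where the function $f^*(x')$ satisfies


(2.9)    $\int_{-\infty}^{\infty} |f^*(x')|^2 \, dx' \le 1.$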




Indeed, it is readily seen that the substitution (2.4) transforms the integral 
of the right-hand member of (2.8) into 


d 


where f;*(¢’) corresponds to f*(x’) and 


c= ay. 
+i 
It follows that, on putting f,(¢) =/(z), 
f | fi(pe**) < B 


and we may apply Theorem I. 
Now suppose that 


O<e<y<y, <M <N. 
Then 
2rifts) = f f 

+te—2z x’ +iy—32z 

————- dy —i dy’. 
« N+ iy’ « 


But 


| iy’—2| 


No 


2 Vo 1/2 
f f | + iy’) paw | — 
Né No 
and tends to zero as Np. Similar analysis applies to the last term. Hence 
2nif(z) = lim — an| f dx’ -f 
+te—2 x’ + ivyo— 2 
f(x’ + ie) Pf(x’ + iyo) 
+ie—2 + iyo — 2 


since the last two integrals converge absolutely. Now make $y_0$ tend to
infinity. The second integral does not exceed




and so tends to zero. 

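Since condition (2.6) is satisfied, by a classical argument of F. Riesz†
there exists a sequence $\{\varepsilon_n \downarrow 0\}$ such that $f(x + i\varepsilon_n)$ converges weakly to a
function $f^*(x)$ satisfying (2.9). Hence


$2\pi i f(z) = \int_{-\infty}^{\infty} \frac{f(x' + i\varepsilon_n)}{x' + i\varepsilon_n - z} \, dx' \to \int_{-\infty}^{\infty} \frac{f^*(x')}{x' - z} \, dx' \quad \text{as } \varepsilon_n \to 0,$


which is the desired result.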
3. We proceed now to our main theorem. 


THEOREM III. A necessary and sufficient condition for the closure $L^2$ of the
functions $t^{\lambda_n}$, $\Re(\lambda_n) > -\tfrac12$, in the interval (0, 1), is the divergence of the series


(3.1)    $\sum_{n=1}^{\infty} \frac{1 + 2\Re(\lambda_n)}{1 + |\lambda_n|^2}.$
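A minimal numerical sketch of the criterion, assuming the series (3.1) is read as $\sum (1 + 2\Re(\lambda_n))/(1 + |\lambda_n|^2)$. It compares partial sums for two illustrative exponent sequences that are not taken from the text: $\lambda_n = n$, where the terms behave like $2/n$ and the sums grow without bound, and $\lambda_n = n^2$, where the terms behave like $2/n^2$ and the sums settle down. Partial sums can of course only suggest divergence or convergence, not prove it.

```python
def szasz_partial_sum(exponents, count):
    """Sum of (1 + 2 Re(lam)) / (1 + |lam|^2) over the first `count` exponents."""
    total = 0.0
    for n in range(1, count + 1):
        lam = complex(exponents(n))
        total += (1.0 + 2.0 * lam.real) / (1.0 + abs(lam) ** 2)
    return total


def linear(n):
    """Illustrative choice lambda_n = n (series expected to diverge)."""
    return n


def quadratic(n):
    """Illustrative choice lambda_n = n^2 (series expected to converge)."""
    return n * n


for count in (10, 1000, 100000):
    print(count, szasz_partial_sum(linear, count), szasz_partial_sum(quadratic, count))
```

Under the theorem, the first behaviour corresponds to closure of the powers $t^{n}$ and the second to non-closure of the powers $t^{n^2}$.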


We observe first that if the functions $t^{\lambda_n}$ are not closed $L^2$ then there
exists a function $\varphi(t)$ of integrable square on (0, 1) which is orthogonal to
them all, so that we have


(3.2)    $0 < \int_0^1 |\varphi(t)|^2 \, dt < \infty,$


(3.3)    $\int_0^1 \varphi(t) \, t^{\lambda_n} \, dt = 0 \quad (n = 1, 2, 3, \dots).$


Conversely if the system $\{t^{\lambda_n}\}$ is closed then there exists no function $\varphi(t)$ not
identically zero satisfying (3.2) and (3.3). Now let us write $t = e^x$; the
conditions (3.2) and (3.3) become


$0 < \int_{-\infty}^{0} |\varphi(e^x)|^2 e^x \, dx < \infty,$


$\int_{-\infty}^{0} \varphi(e^x) \exp[x(1 + \lambda_n)] \, dx = 0.$


Upon writing $\varphi(e^x) e^{x/2} = \Phi(x)$, we transform the last two formulas into


† Untersuchungen über Systeme integrierbarer Funktionen, Mathematische Annalen, vol. 69
(1910), pp. 449-497; pp. 466-468.




(3.4)    $0 < \int_{-\infty}^{0} |\Phi(x)|^2 \, dx < \infty,$


(3.5)    $\int_{-\infty}^{0} \Phi(x) \exp[x(\tfrac12 + \lambda_n)] \, dx = 0.$

Thus the closure or non-closure $L^2$ of the functions $t^{\lambda_n}$ on (0, 1) is equivalent
to the non-existence or existence of a function satisfying (3.4) and (3.5),
which is equivalent to the closure or non-closure of the functions
$\exp[x(\tfrac12 + \lambda_n)]$ on the interval $(-\infty, 0)$.

4. Proceeding now to the proof of the theorem we observe first that, if
there exists a function satisfying (3.4) and (3.5), then the function


$f(z) = (2\pi)^{-1/2} \int_{-\infty}^{0} \Phi(x) e^{-ixz} \, dx$


exists for $\Im(z) > 0$, and defines an analytic function in that half-plane.
Further, by (3.5), it vanishes at the points $(\tfrac12 + \lambda_n)i$. Since, by Plancherel's
theorem,


$\int_{-\infty}^{\infty} |f(x + iy)|^2 \, dx = \int_{-\infty}^{0} |\Phi(\xi)|^2 e^{2y\xi} \, d\xi \le \int_{-\infty}^{0} |\Phi(\xi)|^2 \, d\xi,$

the series (3.1) converges by Theorem II. Thus the non-closure of the
functions $\{t^{\lambda_n}\}$ implies the convergence of (3.1).

To obtain the converse we have to show that when (3.1) converges then
we can find a function $\Phi(x)$ satisfying (3.4) and (3.5). Now in virtue of
Theorem II we can find a function $f(z)$ which is analytic for $\Im(z) > 0$, is
uniformly bounded in this half-plane, and vanishes at the points $(\tfrac12 + \lambda_n)i$, with
the integral


| f(a + iy) |2dx 
uniformly bounded. Let g,() denote the Fourier transform 
+ iy)e**dx 
of f(x+zy). Then the argument given in detail in the first note of this series 


(Theorem II) shows that g,(é) is of the form G(t)e% for <0 and vanishes 
for §>0, where 


0 


Now 




(2m)-12 f G(é) exp [EG + 


0 
= f exp + exp [— #90.) 


= exp [— (y = RG +) 


= An) + + = + = 0. 
Thus, for all 


exp [ea +3.) at = 0. 


We have only to identify $\Phi(x)$ with $G(x)$ and our theorem is proved.

5. The problem of the closure in $L^2$ of functions $e^{\lambda_n x}$ on a finite interval
is much more difficult than the corresponding one for an interval which is
infinite in one direction. We have obtained a number of theorems in this
direction but nothing like a complete answer to the problem. In the case
however where all the numbers $\lambda_n$ are real (we need no longer assume that $\lambda_n$ is
positive or negative) a necessary and sufficient condition for the closure of
the functions $e^{\lambda_n x}$ on a finite interval is the divergence of the series

$\sum_{n=1}^{\infty} \frac{1}{1 + |\lambda_n|}.$


IV. A THEOREM ON CLOSURE 
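1. The present note is devoted to the proof of the following theorem:
THEOREM I. The set of functions $\{e^{-\pi|x|/2} e^{i\lambda_n x}\}$ is closed $L^2$ over $(-\infty, \infty)$
when and only when


(1.1)    $\sum_{n=1}^{\infty} \frac{\cos \Im(\lambda_n)}{\cosh \Re(\lambda_n)} = \infty.$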


The condition (1.1) should be contrasted with the condition 


> Sn) 
1 +| An? 
which is a necessary and sufficient condition for the closure $L^2$ of the
functions $\{e^{-\pi|x|/2} e^{i\lambda_n x}\}$ on the interval $(0, \infty)$ (see e.g. the preceding note of this
series). Thus if for instance all the numbers $\lambda_n$ are real, the conditions for
closure on the intervals $(0, \infty)$ and $(-\infty, \infty)$ are


0, — < < 


1 


respectively. 
2. We shall prove the theorem by the following chain of lemmas: 


Lemma 1. As $|x| \to \infty$ in either direction along the real axis,
$|\Gamma(ix + \tfrac12)| \sim (2\pi)^{1/2} e^{-\pi|x|/2}.$


This is an immediate consequence of Stirling’s formula. 
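The rate of approach can be illustrated numerically. The sketch below assumes Lemma 1 is read as $|\Gamma(ix + \tfrac12)| \sim (2\pi)^{1/2} e^{-\pi|x|/2}$, and it does not use Stirling's formula: it evaluates the left-hand side through the classical reflection identity $|\Gamma(\tfrac12 + ix)|^2 = \pi/\cosh \pi x$, which is brought in here only for the purpose of the check. The function names are illustrative.

```python
import math


def abs_gamma_half_line(x: float) -> float:
    """|Gamma(1/2 + ix)| via the reflection identity |Gamma(1/2+ix)|^2 = pi / cosh(pi x)."""
    return math.sqrt(math.pi / math.cosh(math.pi * x))


def lemma_asymptote(x: float) -> float:
    """Assumed right-hand side of Lemma 1: (2 pi)^(1/2) exp(-pi |x| / 2)."""
    return math.sqrt(2.0 * math.pi) * math.exp(-math.pi * abs(x) / 2.0)


# Ratio of the two sides at a few sample abscissae; it should tend to 1.
ratios = {x: abs_gamma_half_line(x) / lemma_asymptote(x) for x in (0.5, 2.0, 5.0, -5.0)}
```

The ratio equals $(1 + e^{-2\pi|x|})^{-1/2}$, so the two sides already agree to many decimal places for moderate $|x|$; this is why the weights $e^{-\pi|x|/2}$ and $\Gamma(ix + \tfrac12)$ can be exchanged in questions of $L^2$ closure.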


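Lemma 2. The set of functions $\{e^{-\pi|x|/2} e^{i\lambda_n x}\}$ is closed $L^2$ when and only
when the set $\{\Gamma(ix + \tfrac12) e^{i\lambda_n x}\}$ is closed $L^2$.


For let

(2.1)    $\int_{-\infty}^{\infty} |f(x)|^2 \, dx < \infty$

and

(2.2)    $\int_{-\infty}^{\infty} f(x) \, \Gamma(ix + \tfrac12) e^{i\lambda_n x} \, dx = 0,$

and put

$g(x) = f(x) \, \Gamma(ix + \tfrac12) e^{\pi|x|/2}.$

Then, by Lemma 1,

(2.3)    $\int_{-\infty}^{\infty} |g(x)|^2 \, dx < \infty$

and

(2.4)    $\int_{-\infty}^{\infty} g(x) \, e^{-\pi|x|/2} e^{i\lambda_n x} \, dx = 0.$

Similarly, (2.1) and (2.2) follow from (2.3) and (2.4). Thus there exists a
function $f(x)$ orthogonal to the set $\{\Gamma(ix + \tfrac12) e^{i\lambda_n x}\}$ when and only when
there exists a function $g(x)$ orthogonal to the set $\{e^{-\pi|x|/2} e^{i\lambda_n x}\}$, and one set
is closed $L^2$ when and only when the other set is closed.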


Lemma 3. The set of functions 
(2.5) exp (— 


is closed $L^2$ over $(-\infty, \infty)$ when and only when the set of functions
$\{\Gamma(ix + \tfrac12) e^{i\lambda_n x}\}$ is closed $L^2$ over $(-\infty, \infty)$.


Let 

(nm = 1,2,---) 




This follows from the fact that the Fourier transform of (2.5), apart
from a constant factor, is $\Gamma(ix + \tfrac12) e^{i\lambda_n x}$, and that, by Plancherel's theorem,
$L^2$-closure is invariant under a Fourier transformation.


Lemma 4. The set of functions $\{x^{\mu_n}\}$, $\mu_n = e^{-\lambda_n} - \tfrac12$, is closed $L^2$ over (0, 1)
when and only when the set of functions (2.5) is closed $L^2$ over $(-\infty, \infty)$.


For the pair of statements 


f tax < @ 
0 


= 0 


is equivalent to the pair of statements 


fl = 0 


f exp (— e?»)dv = 0, 


where
$g(v) = f[\exp(-e^{v})] \, e^{v/2} \exp(-e^{v}/2),$
and we may again apply the argument of Lemma 2.
Lemma 5. The set of functions $\{x^{\mu_n}\}$, $\Re(\mu_n) > -\tfrac12$, is closed $L^2$ over (0, 1)
when and only when


(2.6)    $\sum_{n=1}^{\infty} \frac{1 + 2\Re(\mu_n)}{1 + |\mu_n|^2} = \infty.$


This is a well known theorem of Szász.†
Condition (2.6) is equivalent to (1.1). Theorem I now follows by
combining Lemmas 2, 3, 4, and 5.


V. ON ENTIRE FUNCTIONS 


1. Let $f(z)$ be an entire function, $f(0) = 1$, and $\{z_n\}$ the sequence of zeros
of $f(z)$. We denote by $M_f(r)$ and $m_f(r)$ respectively the maximum and
minimum of $|f(z)|$ on the circle $|z| = r$, and by $n_f(r)$ the number of zeros of $f(z)$


† Loc. cit. See also our preceding note.




contained in the interior of this circle. The purpose of this note is to prove the 
following two theorems: 


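THEOREM I. Let
(1.1)    $\log M_f(r) = O(r^{1/2})$
and


(1.2)    $\int_0^{\infty} \log^+ m_f(r) \, r^{-3/2} \, dr < \infty.$


Then
(1.3)    $n_f(r) \sim A r^{1/2},$
where the constant A is determined by


(1.4)    $A = -\pi^{-2} \int_0^{\infty} \log \left| \prod_{n=1}^{\infty} \left( 1 - \frac{x}{|z_n|} \right) \right| x^{-3/2} \, dx.$
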
THEOREM II. Let $f(z)$ be an entire function of order not exceeding $\tfrac12$. If the
conditions
(1.5)    $n_f(r) \sim B r^{1/2},$


(1.6)    $\int_0^{\infty} \log |f(x)| \, x^{-3/2} \, dx = -\pi^2 B,$


are satisfied, then all roots of $f(z)$ are positive.


The proofs of these theorems are based upon a lemma which is of inde- 
pendent interest. This lemma is discussed in the next §2. In §3 we give proofs 
of Theorems I and II. In §4 we give a modification of the lemma of §2 and 
discuss its application to the theory of the Riemann zeta-function. In the 
last §5 we give proof of some results analogous to those of §2. They are in 
part contained in a paper by Titchmarsh, and in part represent extensions 
of his results. 

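2. Let $\{\lambda_\nu\}$ be a monotone sequence of positive numbers such that the
series $\sum_1^{\infty} \lambda_\nu^{-2}$ converges. We set


(2.1)    $\varphi(z) = \prod_{\nu=1}^{\infty} \left( 1 - \frac{z^2}{\lambda_\nu^2} \right).$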


† E. C. Titchmarsh, On integral functions with real negative zeros, Proceedings of the London
Mathematical Society, (2), vol. 26 (1927), pp. 185-200.




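Lemma 2.1. If $\sum_1^{\infty} \lambda_\nu^{-2}$ converges then the statements
(2.2)    $\log \varphi(iy) \sim \pi A |y|$ as $|y| \to \infty$,


(2.3)    $\int_{-\infty}^{\infty} \log |\varphi(x)| \, x^{-2} \, dx = -\pi^2 A$


are completely equivalent.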
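A concrete instance may help fix the normalization. The sketch below takes $\lambda_\nu = \nu$, an illustrative choice not made in the text, for which the product $\prod (1 - z^2/\nu^2)$ is the classical $\sin \pi z/(\pi z)$ and $\varphi(iy) = \sinh \pi y/(\pi y)$. It checks that the closed form agrees with a truncated product and that $\log \varphi(iy)/(\pi y)$ tends to 1, so that the constant A of the lemma would equal 1 in this case. Function names and truncation lengths are assumptions of the sketch.

```python
import math


def log_phi_imag(y: float) -> float:
    """log phi(iy) = log(sinh(pi y) / (pi y)) for y > 0, written to avoid overflow."""
    t = math.pi * y
    # sinh(t) = e^t (1 - e^{-2t}) / 2, hence log(sinh(t)/t) = t + log(1 - e^{-2t}) - log(2t)
    return t + math.log1p(-math.exp(-2.0 * t)) - math.log(2.0 * t)


def product_log_phi_imag(y: float, terms: int) -> float:
    """The same quantity from the product truncated after `terms` factors, lambda_nu = nu."""
    return sum(math.log1p((y / nu) ** 2) for nu in range(1, terms + 1))


closed = log_phi_imag(2.0)
truncated = product_log_phi_imag(2.0, 20000)
growth_ratio = log_phi_imag(1000.0) / (math.pi * 1000.0)
```

The ratio approaches 1 only slowly, with a correction of order $\log y / y$, which is consistent with the lemma asserting an asymptotic relation rather than an estimate with a remainder.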


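We have, assuming $y > 0$,


(2.4)    $(\pi y)^{-1} \log \varphi(iy) = (\pi y)^{-1} \sum_{\nu=1}^{\infty} \log \left( 1 + \frac{y^2}{\lambda_\nu^2} \right) = (\pi y)^{-1} \int_0^{\infty} \log \left( 1 + \frac{y^2}{t^2} \right) d\Lambda(t),$


where $\Lambda(t)$ is the number of $\lambda_\nu$'s not exceeding $t$. Similarly


(2.5)    $-\pi^{-2} \int_{-y}^{y} \log |\varphi(x)| \, x^{-2} \, dx = -2\pi^{-2} \int_0^{y} x^{-2} \, dx \int_0^{\infty} \log \left| 1 - \frac{x^2}{t^2} \right| d\Lambda(t)$

$= -2\pi^{-2} \int_0^{\infty} d\Lambda(t) \int_0^{y} \log \left| 1 - \frac{x^2}{t^2} \right| x^{-2} \, dx$

$= -2\pi^{-2} y^{-1} \int_0^{\infty} \frac{y}{t} \, d\Lambda(t) \int_0^{y/t} \log |1 - s^2| \, s^{-2} \, ds.$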
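Expressions (2.4), (2.5) are both of the form


(2.6)    $y^{-1} \int_0^{\infty} N\!\left( \frac{t}{y} \right) d\Lambda(t),$


where $\Lambda(t)$ is a monotone increasing function. In (2.4) we have


(2.7)    $N(\lambda) = N_1(\lambda) = \frac{1}{\pi} \log \left( 1 + \frac{1}{\lambda^2} \right),$


while, in (2.5),


(2.8)    $N(\lambda) = N_2(\lambda) = -\frac{2}{\pi^2 \lambda} \int_0^{1/\lambda} \log |1 - x^2| \, x^{-2} \, dx = \frac{2}{\pi^2} \left\{ \log \left| 1 - \frac{1}{\lambda^2} \right| + \frac{1}{\lambda} \log \left| \frac{1 + \lambda}{1 - \lambda} \right| \right\}.$


The function $N_1(\lambda)$ is positive and monotone decreasing, the same being
also true of $N_2(\lambda)$ since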

(2.9)    $N_2'(\lambda) = -\frac{2}{\pi^2 \lambda^2} \log \left| \frac{1 + \lambda}{1 - \lambda} \right|.$


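If we write $N(\lambda)$ for either of $N_1(\lambda)$, $N_2(\lambda)$, the following properties are easily
established:


(2.10)    $N(\lambda) = O\!\left( \log \frac{1}{\lambda} \right)$ as $\lambda \to 0$, $\quad N(\lambda) = O\!\left( \frac{1}{\lambda^2} \right)$ as $\lambda \to \infty$,


(2.11)    $\max \lambda N(\lambda) < \infty, \quad N(\lambda) > 0,$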


1 


f N2(A)A“ddA = ———— f log la 
0 1) 0 1—A 


wit 
2 tan — 
2 2 


+ 1) ( + 1) 


It follows that


(2.12)    $\int_0^{\infty} N(\lambda) \lambda^{iu} \, d\lambda \ne 0$ when $u$ is real,


(2.13)    $\int_0^{\infty} N(\lambda) \, d\lambda = 1.$


We observe finally that the expressions 


~f = log | (x) | x~*dx 


—0 as y-0. Hence either of the statements 


(2.14) =f. +4 as y 0, i = 1,2, 




implies the boundedness of the corresponding integral 


1 t 
f Ni 
vo 
over the range $(0, \infty)$. A direct application of a Tauberian theorem of Wiener†
shows that statements (2.14) are completely equivalent, which is precisely the
result of our Lemma 2.1.

3. We now proceed to the proofs of Theorems I and II.

Proof of Theorem I. We first observe that by a known theorem the
assertions $\log M_f(r) = O(r^{1/2})$ and $n_f(r) = O(r^{1/2})$ are equivalent. It follows that
if we replace each zero by another zero with the same absolute value but
situated on the positive part of the real axis, changing in effect


$\prod_{n=1}^{\infty} \left( 1 - \frac{z}{z_n} \right)$ into $\prod_{n=1}^{\infty} \left( 1 - \frac{z}{|z_n|} \right),$


we certainly do not affect the truth of (1.1). Secondly this process will
decrease $m_f(r)$ for every value of $r$, so that we do not affect the truth of (1.2)
either. Thus it is legitimate to assume that all the zeros of $f(z)$ are real and
positive. Let them be


$\lambda_\nu^2, \quad 0 < \lambda_1 \le \lambda_2 \le \cdots; \quad \sum_{\nu=1}^{\infty} \lambda_\nu^{-2} < \infty.$


Our next observation is that, by some theorems of Titchmarsh‡, for a
function of this special type the assertions


$n_f(r) \sim A r^{1/2}$ and $\log M_f(r) \sim \pi A r^{1/2}$


are equivalent, so that we may leave $n_f(r)$ and confine our attention to
$M_f(r)$. We write
$z = w^2, \quad w = u + iv,$


so that $f(z)$ is transformed into the new function

$\varphi(w) = f(w^2) = \prod_{\nu=1}^{\infty} \left( 1 - \frac{w^2}{\lambda_\nu^2} \right)$

which satisfies


(3.1)    $\log^+ |\varphi(w)| = O(|w|)$
and


† N. Wiener, Tauberian theorems, Annals of Mathematics, (2), vol. 33 (1932), pp. 1-100;
Theorem XI', p. 30.
‡ Loc. cit., Theorems I and II.


(3.2)    $\int_{-\infty}^{\infty} \log^+ |\varphi(u)| \, u^{-2} \, du < \infty.$


We have to show that
(3.3)    $\log \varphi(iv) \sim \pi A |v|.$


Since the series $\sum_1^{\infty} \lambda_\nu^{-2}$ converges, by Lemma 2.1 it will be sufficient to
establish the convergence of the integral


(3.4)    $\int_{-\infty}^{\infty} \log |\varphi(u)| \, u^{-2} \, du = -\pi^2 A.$


It follows from (3.1) that the ratio A(#)/t is bounded. Hence, by (2.5) 
and (2.9), 


f ‘Tog | udu = — = N2 (=)aaco 


v dt 1 + t/y dt 
= —1 = O(1). 


Being combined with (3.2) this shows that 
f "log" | (1) | udu = O(1). 
Hence the integral 
f | (1) | 


converges, whence, again by (3.2), the convergence of the integral (3.4)
follows. Expression (1.4) for the constant A of Theorem I is now immediately
obtained.

Another, non-Tauberian, proof of Theorem I proceeds as follows. We
have shown in the above discussion that the integral


f | log | | | u-*du 


converges. Hence the harmonic function 


v log | | de! 


u)? + 2? 


1 
F(u, 0) = log | o(u + iv) | -— 




exists in the upper half-plane $v > 0$, and vanishes for $v = 0$ (except at the
zeros of $\varphi(w)$, $u = \pm\lambda_\nu$). It may be extended by reflection to the lower
half-plane. The resulting harmonic function will be continuous everywhere, even
at the zeros of $\varphi(w)$. Indeed, $F(u, v)$ vanishes along the segment of the line
$v = 0$ through such a point, and thus cannot have a logarithmic singularity
there, while the order of singularity cannot be greater than logarithmic.
Thus $F(u, v)$ is the real part of an entire function.
By (1.1), 
log | ¢(z) | = O(| ). 


Now, 


, —2u 2u 


(u’ — u)?+ 2? —2u 2u 


2u u'v 


< 80 | log | | | + const 
= O(| 2|). 


Thus we must have 
F(u, v) = rAv, > 0, 


for some A. Thus, as »>&, 


1 , 
log ¢(iv) = log | | = rAv + — ~ du’ = rAv + o(2) 
v 


which is the desired result. 
Proof of Theorem II. If {z,} is the sequence of zeros of f(z) we have 


fe) = TE(1-=). 


¥(w) = f(w*) = 


f*(2) = Il 


o(w) = f*(z) = 


Then, since Av? < 


n;(r?) = np(r?) ~ log Mp(r?) = log Mg (r) = log G(ir). 


We set 
w 
(1-=), 
y=] Zy 
+4 
w 
( w?. 




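Hence, by hypothesis (1.5) of Theorem II,
$\log \varphi(ir) \sim \pi B r,$
and, by Lemma 2.1,


$\int_{-\infty}^{\infty} \log |\varphi(u)| \, u^{-2} \, du = -\pi^2 B.$
By hypothesis (1.6),
$-\pi^2 B = \int_0^{\infty} \log |f(x)| \, x^{-3/2} \, dx = \int_{-\infty}^{\infty} \log |\psi(u)| \, u^{-2} \, du.$
Thus we must have
(3.5)    $\int_{-\infty}^{\infty} \left[ \log |\psi(u)| - \log |\varphi(u)| \right] u^{-2} \, du = 0.$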


On the other hand, 
— u?| 


0, 


log | ¥(u) | — log| = log 


unless all roots $z_n$ are real and positive. Thus relation (3.5) implies that all
roots $z_n$ are positive, and Theorem II is proved.

4. In this paragraph we use the notation of §2, but make a slightly
different assumption concerning the asymptotic behavior of $\varphi(iy)$.


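Lemma 4.1. If the series $\sum_1^{\infty} \lambda_\nu^{-2}$ converges, then the statements
(4.1)    $\log \varphi(iy) \sim \pi A |y| \log |y|$ as $|y| \to \infty$
and
(4.2)    $\int_{-y}^{y} \log |\varphi(x)| \, x^{-2} \, dx \sim -\pi^2 A \log |y|$


are completely equivalent.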


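We assume $y > 0$. Using the kernels $N_1(\lambda)$, $N_2(\lambda)$ of §2 we may replace
(4.1), (4.2) respectively by


(4.3)    $(y \log y)^{-1} \int_0^{\infty} N\!\left( \frac{t}{y} \right) d\Lambda(t) \to A, \quad N(\lambda) = N_1(\lambda), \ N_2(\lambda).$


We now observe that either of the statements (4.3) implies
(4.4)    $\Lambda(y) = O(y \log y).$
Indeed if (4.3) is satisfied with $N = N_1$ or $N = N_2$, then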




O(1) & (vlog aa(t) 


= N(1)(y log — (y log f 


> N(1)A(y)(y log 


since N(A) is positive and decreasing. Next we prove that, under the con- 
dition (4.4), (4.3) is equivalent to 


3 
(4.5) —A, N(d) = N,(d), N2(d), 
v0 


where 


(4.6) A*(y) = t)-"dA(t). 
0 


It is readily seen from (4.4) and (4.6) that A*(y) vanishes for sufficiently small 
y, while 
A*(y) = O(y) as 


Now the difference of the left-hand members of (4.5) and (4.3) is equal to 


17° (ty 1 1 
y Jo y/Nlogt logy 


— (y log log 2] 
=O { (log log =O (—) 


and $\to 0$ as $y \to \infty$ or $y \to 0$. The same theorem of Wiener which was applied in
the proof of Lemma 2.1 shows immediately the equivalence of the two
statements (4.5); hence the two statements (4.3), and consequently (4.1) and
(4.2), are also equivalent.
In order to apply Lemma 4.1 to the theory of the Riemann zeta-function 


we introduce 




2 
1 +2) + i) 
5° "As ‘ 
It is known that $\Xi(z)$ is an entire function, is even and has all zeros in the
strip $|\Im(z)| < \tfrac12$. Moreover
$\log \Xi(iy) = O(y) + \log \Gamma(y/2) \sim \tfrac12 y \log y.$


% = % + ise, 27 >0,| | <4; 
Let us put 
2? 
H(z) = ell 1 -=). 
We have outside the strip | $(z)| <1, 


og |= log 


22 — 


E(z) A? | [2s] 


| 


Thus, assuming y>0, 


log H(iy) ~ }y log y, 
and, by Lemma 4.1, 


y 
(4.8) (log f log | cH(«) | a-%dx  — asy— oo, 
-y 


Again, on the real axis, 


whence 
log(|1 — x*/z2 — |) 


Furthermore 


zs) “\ ¥ 
We set 
| 
| 
| 
x? x? 
1 




x? 


2 
f log t-*dt 
Jo 1-f 


hence we can integrate term-wise and obtain 


0< |” dx= DI,<o. 


x 


Then, by (4.8), 


(log f tog | | =. 


If we return to the zeta-function using (4.7), this gives our final result 


(4.9) log | o(log 4). 


x 


5. Titchmarsh has† discussed asymptotic properties of entire functions
with real negative zeros. In this paragraph we indicate some results which
overlap with those of Titchmarsh. The method used in deriving these results
is closely analogous to that used in proving Lemma 2.1; therefore we shall
give here only a brief outline of the proof, leaving the details to the reader.

Let

$f(z) = \prod_{n=1}^{\infty} \left( 1 + \frac{z}{a_n} \right)$

be an entire function all of whose zeros $\{-a_n\}$ are negative. It will be
assumed that


(5.1)    $0 < a_1 \le a_2 \le \cdots, \quad \sum_{n=1}^{\infty} a_n^{-1} < \infty.$


For simplicity we shall use the symbol $n(r)$ instead of $n_f(r)$ of the preceding
paragraphs. The letter $x$ will designate a real positive variable which tends
to infinity.


† Loc. cit.




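Lemma 5.1. Let $\lambda$, $\rho$, $\theta$ be fixed numbers such that
(5.2)    $\lambda > 0, \quad 0 < \rho < 1, \quad |\theta| < \pi.$
Then the statements
(i)    $n(x) \sim \lambda x^{\rho},$
(ii)    $\log f(x) \sim \pi\lambda \operatorname{cosec} \pi\rho \cdot x^{\rho},$
(iii)    $\log |f(x e^{i\theta})| \sim \pi\lambda \operatorname{cosec} \pi\rho \, \cos \theta\rho \cdot x^{\rho},$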


cosec mp COs Op 
(iv) f log | f(re'*) | dems = (20) 
0 


are all equivalent. In the last statement (iv) the right-hand member in the case 
6=7/(20) should be replaced by its limiting value as p—71/(26). 


We first observe that the convergence of the series $\sum_1^{\infty} a_\nu^{-1}$ implies
(5.3)    $n(x) = o(x).$
Next let us put
(5.4)    $\omega(x) = x^{-\rho} n(x).$


In view of the fact that $n(x)$ is monotone increasing it is readily seen that the
statements (i), which can be written as


(5.5)    $\omega(x) \to \lambda,$
and
(5.6)    $\int_0^{x} \omega(r) \, dr \sim \lambda x,$

are equivalent.†
Our next step is to transform the left-hand members of (ii-iv) in such a 


way as to allow an immediate application of Wiener’s Tauberian theorems. 
We have 


log f(x) 


x 


† This is readily proved directly or derived from a theorem of Wiener, loc. cit., Theorem
XIII, pp. 34-35; it also follows from a well known theorem of Landau, Beiträge zur analytischen
Zahlentheorie, Rendiconti del Circolo Matematico di Palermo, vol. 26 (1908), pp. 169-302; p. 218.




2 
dn(t) 


1 
log | f(xei*) | = log | 1 
0 


t 
1 +—cos@ 
x 


1? 
=— f dt, 
x 0 x 2t ad 


1+—cosé + — 
x x? 


f (29) log | f(re'®) | dr 
0 


t 
1 +—cos@ 
r 


z t 
0 0 r 2t 2 


1 + —cos@+— 
r r? 


1 
1 + —cosé 


1 x (20) 
=—— f w(t)dt (=) f (20) 
x Jo t z/t 1 


1 + — cos 6 + — 
r r? 


(see (5.9) below, for 09 =7/2). Thus all the statements (ii-iv) are 
expressible in the form 


where N(y) stands, respectively, for 


N = 
a(y) cosec mp 1 + y 


1 1+ ycosé 
Nily) = 
cosec mp cos Op 1+2ycos@+ y? 


i+ 1 
— cos 
Ns(y) 20 ye (20) f (20) dy, 
Cosec mp COS Op 1 


/y 2 
1+—cosé@ + — 
r r? 


A direct computation yields 




dy = m cosec + p), 
J, 1+y 


+ y cos 
o 1+ 2ycos@+ y? 


5.9 =— f ————dy + f ————d | 
2 [ 1+ 0 1+ 


= f er + dy 
0 


= m cosec + p) cos O(iu + p), 


1 
1 +—cos@ 
r 


f (20) dy f (20) 
0 Vy 


2 1 
1 +—cos@ + — 
r r? 


1+ 7rcos0 
f (28) ar f (20 dy 
0 1 + 27 cos + 


cosec + p) cos 0(iu + p) 


The last result is first derived in the case where 0 <p <1+7/(26), but is 
easily extended to the general case $0 < \rho < 1$ by analytic continuation. It is
an easy matter to verify that the kernels $N_3(y)$, $N_4(y)$ when $|\theta| < \pi/2$, and
$N_5(y)$ are possessed of all the properties of the kernel $N(y)$ stated in the
proof of Lemma 2.1. We set $\Lambda(t) = \int_0^{t} \omega(\tau) \, d\tau$. Since $\omega(t) \ge 0$, $\Lambda(t)$ is monotone
increasing. Hence Wiener's theorem used in §2 may be applied here with the
result that the statements (ii), (iii) when $|\theta| < \pi/2$, and (iv) are equivalent,
while either of (ii) or (iv) implies (iii) when $\pi/2 < |\theta| < \pi$. It should be
observed that the kernel $N_4(y)$ is not positive when $|\theta| > \pi/2$, while $N_5(y)$ is
positive over the whole range $|\theta| < \pi$. The introduction of this kernel was
necessitated by the lack of positiveness of $N_4(y)$ when $|\theta| > \pi/2$. Another
theorem of Wiener† will show that either of the statements (ii), (iii) when
$|\theta| < \pi/2$, and (iv) implies (5.6), hence (5.5) which is the same as (i). On the
other hand it may be proved directly‡ that (i) implies (ii), hence also (iii)
and (iv). This completes the proof of Lemma 5.1.


† N. Wiener, loc. cit., Theorem XI', pp. 31-32.
‡ Titchmarsh, loc. cit., Theorem I.




VI. ON TWO PROBLEMS OF PÓLYA


1. Pólya† has set the following problem: Let the real numbers $m_1, m_2, \dots$
have the properties $0 < m_1 < m_2 < \cdots$ and


(1.1)    $\lim_{n \to \infty} \frac{n}{m_n} > \frac{b - a}{2\pi}.$


Furthermore, let $f(x)$ be continuous in the closed interval $[a, b]$. Then it will
follow from


(1.2)    $\int_a^b f(x) \cos m_n x \, dx = \int_a^b f(x) \sin m_n x \, dx = 0$


that $f(x)$ vanishes identically.
There is no restriction in supposing $b = -a = \pi$. We shall prove the
following more general theorem:

THEOREM I. Let $0 < m_1 < m_2 < \cdots$ and let


(1.3)    $\overline{\lim}_{n \to \infty} \frac{n}{m_n} > 1.$


Then if $f(x)$ belongs to $L^2$ and
(1.4)    $\int_{-\pi}^{\pi} f(x) e^{\pm i m_n x} \, dx = 0 \quad (n = 1, 2, 3, \dots),$


$f(x)$ vanishes except over a set of measure zero.


It is very important that we have replaced $\lim$ by $\overline{\lim}$. This yields us a
much deeper theorem.

Since (1.4) is satisfied with $f(x)$ replaced by $f(x) \pm f(-x)$, it suffices to
consider only the cases of $f(x)$ even or odd. We shall give the discussion of the
case $f(x)$ even, under the additional assumption that $\int_{-\pi}^{\pi} f(t) \, dt \ne 0$. The case
where this assumption is not satisfied as well as the case of $f(x)$ odd will
require but slight modifications which may be left to the reader. We set


(1.5)    $\varphi(u) = \int_{-\pi}^{\pi} f(t) e^{iut} \, dt,$


where the entire function $\varphi(u)$ is even and where we may assume without loss
of generality that $\varphi(0) = 1$. We observe that, on setting $u = \sigma + i\tau$, we have

† Jahresbericht der Deutschen Mathematiker-Vereinigung, vol. 40 (1931), Abteilung 2, p. 81,
Problem 108.


1/2 
(1.6) | (4) | = | o(o + ir) | {f | f(t) jas etirl = O(er!*!), 


On the other hand by the theory of Fourier transforms we know that $\varphi(\sigma) \in L^2$
over $(-\infty, \infty)$ and that the Fourier transform of $\varphi(\sigma)$ vanishes outside
$(-\pi, \pi)$. Hence by Theorem II of our Note I,†


(1.7) f He 


By the change of variable $u^2 = z$ we obtain a function $\psi(z) = \varphi(u)$ which
satisfies the conditions of Theorem I of our preceding Note V. It follows at once
that the limit


(1.8) tim 


Sow 
exists. Let $\{u_n\}$ be the sequence of zeros of $\varphi(u)$. It is clear that $\{\pm m_n\}$ is a
subsequence of $\{u_n\}$. Hence, by (1.3),


(1.9) 


However, by Jensen’s theorem, in view of (1.6), 


=f" = — f "to | | do 
(1.10) t o 


1 ae 1 1 
<—] -r| sin do + o(—) =2+ o(-). 
2rrJ o r r 
Hence 
1 t 
(1.11) som Mags 
fo fF 0 t 

The resulting contradiction shows that f(x) must vanish except for a set of 
measure zero. 

2. Pólya‡ has also set the following problem: Let $f(z)$ be an entire function
which is bounded for the integral arguments $z = 0, \pm 1, \pm 2, \dots, \pm n, \dots$. Let


(2.1)    $\log M_f(r) = o(r).$
Then $f(z)$ reduces to a constant.


† The present volume of these Transactions, p. 349.
‡ Jahresbericht der Deutschen Mathematiker-Vereinigung, vol. 40 (1931), Abteilung 2, p. 80,
Problem 105.




It is clearly sufficient to prove the theorem for an even f(z), for if f(z) is 
odd, we need only consider f(z)/z, which will be even, and will hence reduce 
to a constant which can only be zero. The general function may then be 
treated by reducing it to the sum of an odd and an even part. 

If f(z) is even, 

(2.2) g(z) = [f(z) — 


will be entire. Thus 


Let us form 
sin — 2) 


(2.4) G(z) = ———_—— 
a(n — 2) 


Clearly 
(2.5) G(x + iy) = O(ye*!¥!). 
Let us now form the entire function
(2.6)    $H(z) = [g(z) - G(z)] \operatorname{cosec} \pi z.$
For all values of $y$ and all integral values of $n$ we shall have
+ 4) + iy] = Ofexp + 4)? + — y|)} 
(2.7) + 0(1/|y|) 
= + O(1/| 
uniformly in $n$. We have here employed (2.1) and (2.5). Hence


(2.8) fi H(n + + iy) |*dy = 


for all e. 
Let us put 


= [x +4], x2 = [x — 4]. 
Then, by Cauchy’s theorem, 
H 
+ iy) = f 
in — iy 
(2.9) 


dy; 


Hence 


(2.10) f H(x + iy)e“dy 


1 iu 


and, by the Plancherel theorem and the Schwarz inequality,


fi H(x + iy) < const { fl + 3 + iy) |*dy 


H(x2— 43+ iy) Pay 
Thus, by (2.8), 
By an application of Cauchy’s theorem, 
Thus by the Plancherel theorem, 
(2.13) f "| |2e-2"*du = 


This is however only possible if $\varphi(u)$ vanishes almost everywhere for $|u| > \varepsilon$.
Since $\varepsilon$ is arbitrarily small, $\varphi(u)$ must be equivalent to zero. Thus $H(z)$
vanishes, and $g(z) = G(z)$. On the other hand,


(2.14) Mg(r) ~ =r, 
unless every $g(n)$ is zero. This yields
(2.15) G(z) = g(z) = 0, f(z) = f(0), 


which is the desired result. 


VII. ON THE VOLTERRA EQUATION 
1. A theorem of Mercer† asserts that if $0 < \alpha < 1$, and if


† J. Mercer, On the limits of real variants, Proceedings of the London Mathematical Society, (2),
vol. 5 (1907), pp. 206-224.


(1.1)    $\alpha s_n + (1 - \alpha) \frac{s_1 + s_2 + \cdots + s_n}{n} \to s,$

then
(1.2)    $s_n \to s.$
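The content of Mercer's theorem can be seen in a small experiment. The sketch below assumes the hypothesis (1.1) is read as $\alpha s_n + (1 - \alpha)(s_1 + \cdots + s_n)/n \to s$. Given a prescribed sequence $t_n$ of these combined means, it solves for the $s_n$ one term at a time; the choices $\alpha = \tfrac12$ and $t_n = 1 + 1/n$ are illustrative and not from the text.

```python
def recover_sequence(t, alpha):
    """Solve t_n = alpha*s_n + (1 - alpha)*(s_1 + ... + s_n)/n for s_1, s_2, ..."""
    s, running_sum = [], 0.0
    for n, t_n in enumerate(t, start=1):
        # Isolate s_n: the mean contains s_n itself with weight (1 - alpha)/n.
        s_n = (t_n - (1.0 - alpha) * running_sum / n) / (alpha + (1.0 - alpha) / n)
        s.append(s_n)
        running_sum += s_n
    return s


alpha = 0.5
t = [1.0 + 1.0 / n for n in range(1, 1001)]  # combined means tending to s = 1
s = recover_sequence(t, alpha)
```

For these choices a short induction gives $s_n = 1 + 2/(n(n+1))$, so the recovered sequence tends to the same limit 1, as the theorem predicts.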
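This theorem possesses generalizations of a non-trivial nature. The
continuous analogue asserts that if $0 < \alpha < 1$ and if
(1.3)    $\alpha s(x) + (1 - \alpha) \, x^{-1} \int_0^{x} s(\xi) \, d\xi \to s$ as $x \to \infty$,
then
(1.4)    $s(x) \to s.$
By a change of independent variable, this asserts that if


(1.5)    $\alpha S(\xi) + (1 - \alpha) \int_0^{\xi} S(\eta) e^{\eta - \xi} \, d\eta \to s,$


then
(1.6)    $S(\xi) \to s.$
This statement is a particular case of the following theorem: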


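THEOREM I. Let $F(x)$ be measurable and bounded over every finite range
(0, A). Let $K(x) \in L$ over $(0, \infty)$, that is,


(1.7)    $\int_0^{\infty} |K(\xi)| \, d\xi < \infty.$

Let

(1.8)    $F(x) + \int_0^{x} K(x - \xi) F(\xi) \, d\xi \to s$ as $x \to \infty$.
Then if

(1.9)    $\int_0^{\infty} K(\xi) e^{-w\xi} \, d\xi \ne -1, \quad \Re(w) \ge 0,$


we shall have
(1.10)    $F(x) \to s \left[ 1 + \int_0^{\infty} K(\xi) \, d\xi \right]^{-1}.$


Conversely, let $K(x) \in L$, $\int_0^{\infty} K(\xi) \, d\xi \ne -1$, and let (1.8) imply (1.10) for every
$F(x)$ satisfying our conditions. Then (1.9) must be true.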




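2. The second part of this theorem may be proved by reductio ad
absurdum by putting
$F(x) = e^{w_0 x},$
where
$\int_0^{\infty} K(\xi) e^{-w_0 \xi} \, d\xi = -1, \quad \Re(w_0) \ge 0, \quad w_0 \ne 0.$


Then


$\left| F(x) + \int_0^{x} K(x - \xi) F(\xi) \, d\xi \right| = \left| e^{w_0 x} \int_x^{\infty} K(\xi) e^{-w_0 \xi} \, d\xi \right| \le \int_x^{\infty} |K(\xi)| \, d\xi \to 0.$


As (1.10) is obviously false, the second part of the theorem is proved.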

3. The first part of Theorem I will appear as a corollary to a theorem
concerning the Volterra integral equation of the closed cycle.

We shall use the symbol

$A * B(x)$

to indicate the "Faltung" of the two functions $A(x)$, $B(x)$,

$A * B(x) = \int_0^{x} A(x - \xi) B(\xi) \, d\xi.$

It is well known that the (bounded and measurable) solution of the Volterra
integral equation
(3.1)    $G(x) = F(x) + K * F(x)$
is uniquely determined and given by
(3.2)    $F(x) = G(x) + Q * G(x)$
where the resolvent kernel $Q(x)$ itself is determined from
(3.3)    $Q(x) + K(x) = -K * Q(x) = -Q * K(x),$


or else, by
(3.4)    $Q(x) = \sum_{n=1}^{\infty} (-1)^n K^{(n)}(x), \quad K^{(1)}(x) = K(x), \quad K^{(n)}(x) = K * K^{(n-1)}(x).$


We observe that the solution of (3.3) is easily obtained by using Laplace
transforms.† Let us designate by


† See, e.g., S. Bochner, Vorlesungen über Fouriersche Integrale, Leipzig, 1932, chapter VII.
Other references are also found there.


(3.5)    $k(w) = \int_0^{\infty} K(\xi) e^{-w\xi} \, d\xi,$


(3.6)    $q(w) = \int_0^{\infty} Q(\xi) e^{-w\xi} \, d\xi$


the Laplace transforms of $K(x)$, $Q(x)$. Equation (3.4) reduces then to


(3.7)    $q(w) = \frac{-k(w)}{1 + k(w)},$


and $Q(x)$ will be found by inversion of a Laplace integral.
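A worked instance of the resolvent relation, offered as a sketch only. It assumes that the Faltung is $\int_0^x A(x - \xi) B(\xi) \, d\xi$, that the resolvent equation reads $Q + K = -K * Q$, and that (3.7) reads $q = -k/(1 + k)$. For the illustrative kernel $K(x) = c e^{-x}$ (not taken from the text) one has $k(w) = c/(1 + w)$, so that $q(w) = -c/(1 + c + w)$ and the candidate resolvent is $Q(x) = -c e^{-(1 + c)x}$. The code checks the resolvent equation by trapezoidal quadrature; the helper name, the value of $c$ and the step count are all assumptions.

```python
import math


def convolve_at(a, b, x, steps=2000):
    """Trapezoidal approximation to the Faltung: integral_0^x a(x - xi) b(xi) d xi."""
    if x == 0.0:
        return 0.0
    h = x / steps
    total = 0.5 * (a(x) * b(0.0) + a(0.0) * b(x))  # endpoint terms
    for j in range(1, steps):
        xi = j * h
        total += a(x - xi) * b(xi)
    return total * h


c = 1.5  # illustrative constant; any c > -1 keeps 1 + k(w) away from zero for Re(w) >= 0


def K(x):
    """Illustrative kernel K(x) = c e^{-x}."""
    return c * math.exp(-x)


def Q(x):
    """Candidate resolvent kernel obtained from q = -k / (1 + k)."""
    return -c * math.exp(-(1.0 + c) * x)


# Residual of Q + K + K*Q at a few sample points; it should vanish up to quadrature error.
residuals = [abs(Q(x) + K(x) + convolve_at(K, Q, x)) for x in (0.5, 1.0, 3.0)]
```

With $c = (1 - \alpha)/\alpha$ this kernel is the one arising from the continuous form of Mercer's theorem, and the resolvent is integrable over $(0, \infty)$, in agreement with the non-vanishing of $1 + k(w)$ in the right half-plane.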
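The theorem in question may now be stated as follows:


THEOREM II. A necessary and sufficient condition that $Q(x) \in L$ over $(0, \infty)$,
that is,


(3.8)    $\int_0^{\infty} |Q(\xi)| \, d\xi < \infty,$
is that
(3.9)    $k(w) = \int_0^{\infty} K(\xi) e^{-w\xi} \, d\xi \ne -1, \quad \Re(w) \ge 0.$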


If this theorem holds true, the first part of Theorem I is immediately
derived. Indeed, under the assumptions made we have


$F(x) = G(x) + \int_0^{x} G(x - \xi) Q(\xi) \, d\xi.$


Here $G(\xi)$ is bounded over every finite range and $\to s$ as $\xi \to \infty$. Hence $G(\xi)$
is bounded over the whole range $(0, \infty)$. Since $Q(\xi)$ is integrable over $(0, \infty)$
we may pass to the limit as $x \to \infty$ under the integral sign, with the result


$F(x) \to s + s \int_0^{\infty} Q(\xi) \, d\xi = s[1 + q(0)] = s \left[ 1 + \int_0^{\infty} K(\xi) \, d\xi \right]^{-1},$


which is precisely the desired formula (1.10).

4. To prove the necessity of (3.9) we observe that if (3.8) holds then
$q(w)$ as well as $k(w)$ are analytic in the half-plane $\Re(w) > 0$ and continuous
up to and including the boundary $\Re(w) = 0$. This implies that the
denominator in the right-hand member of (3.7) does not vanish for $\Re(w) \ge 0$, so that
(3.9) holds.

The proof that (3.9) is sufficient is more difficult. We introduce the
auxiliary functions




1, | u| <A, 
0, |u| > 24, 
and put
(4.2)    $q^*(w) = \frac{-k(w)}{1 + k(w)},$
(4.3)    $q^*(iu) = q_1(u) + q_2(u),$
(4.4)    $q_1(u) = \varphi_A(u) q^*(iu), \quad q_2(u) = [1 - \varphi_A(u)] q^*(iu).$


We wish to prove that if A is sufficiently large, $q_1(u)$ and $q_2(u)$ are both Fourier
transforms of functions of L.
To begin with, 
ga(u) k( in) 
qi(u) = + 
0 when | «| = 24. 


when | «| < 24, 


Thus $q_1(u)$ is the quotient of two functions, each the Fourier transform of a
function of L, each vanishing outside a finite range, and such that the
denominator function only vanishes in points interior to the region of vanishing
of the numerator function. We may then appeal to a theory due to Wiener†
to show that $q_1(u)$ is the Fourier transform of a function of L.

We have 


— k(iu)[1 — 
1+ k(iu)[1 — o4/2(u)] 


It is easy to show that this is the Fourier transform of a function of L when 
the same is true of 


— k(in)[1 — gaya(u)]{1 + [1 — 


q2(u) = [1 — ga(u)] 


Now 
{ [1 — 


is the Fourier transform of a function h,(x) for which 


1 N. Wiener, The Fourier Integral and Certain of its Applications, Cambridge, 1933; Lemmas 67, 
ro, Ors. 




f | dé | f | nce | ae] 
A n 
cos—(é — — cos A(t — | 


f — f K(n) ay 


An argument of the familiar Fejér type will show that we may choose A so
large that the integral in brackets is less than any given number $\lambda$, $0 < \lambda < 1$.
It will follow at once that $q_2(u)$ is the Fourier transform of a function $F_2(x)$
for which


f | Fa(é)| —— 


Combining this with the similar result for $q_1(u)$, we see that we may write
(4.5) f € L 


We may rewrite (4.5) as 


(4.6) f dé = — f F(é)e~**dé + g*(iu). 
0 


Now, it is readily seen that $k(w) \to 0$ as $|w| \to \infty$, uniformly in the
half-plane $\Re(w) \ge 0$. Since, by hypothesis, $1 + k(w) \ne 0$ for $\Re(w) \ge 0$, there exists
a positive constant $c$ such that


$|1 + k(w)| > c > 0.$
Thus 
0 


is a function of w analytic and bounded in the right half-plane, and continu- 
ous up to and including the imaginary axis. Similarly, 


0 
f 
is a function of w analytic and bounded in the left half-plane, and continuous 
up to and including the imaginary axis. Furthermore, the two functions are 
identical on the imaginary axis. By the classical argument of Riemann- 
Painlevé it readily results that they are parts of the same analytic function, 
which is thus entire and bounded. It hence reduces to a constant, and since 




0 
(4.7) f F(é)e*tdt 0 as — @, 


this constant can only be 0. Thus 


(4.8) q*(w) = = 
1 + k(w) 0 
On the other hand, it follows readily from (3.4) that there exists a wo>0 
such that 
— k(w) 


(4.9) f > wo, 


and 
4.10 — wed 
(4.10) | | < 0 


By the uniqueness theorem for Laplace transforms we conclude that F(x)e-”* 
and Q(x)e-”* coincide almost everywhere, whence Q(x) ∈ L.

5. In this proof, we have used the theorem† of Wiener that if a function
has an absolutely convergent Fourier series and does not vanish, its reciprocal
has an absolutely convergent Fourier series. P. Lévy‡ has pointed out that
the same methods suffice for the following theorem: if a function f(x) has an
absolutely convergent Fourier series, and Φ(u) is analytic over the range of
values of f(x), then Φ[f(x)] has an absolutely convergent Fourier series. By
methods not essentially different from those of this paper, we may extend
this theorem as follows: if f(x) is the Fourier transform of a function of L,
and Φ(x) is analytic over the range of values of f(x), including 0, then

\[ \Phi[f(x)] - \Phi(0) \]

is the Fourier transform of a function of L.


† Loc. cit., Lemma $6_{16}$.
‡ P. Lévy, Sur la convergence absolue des séries de Fourier, Paris Comptes Rendus, vol. 196
(1933), p. 463.


MASSACHUSETTS INSTITUTE OF TECHNOLOGY, 
CAMBRIDGE, MASS.




INFINITE SYSTEMS OF ORDINARY DIFFERENTIAL 
EQUATIONS WITH APPLICATIONS TO CERTAIN 
SECOND-ORDER PARTIAL DIFFERENTIAL 
EQUATIONS* 


BY 
DANIEL C. LEWIS, JR. 
INTRODUCTION 


From a purely formal point of view, the problem of integrating the non- 
linear partial differential equation 


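\[ \frac{\partial^2 u}{\partial t^2} - \frac{\partial^2 u}{\partial y^2} = F\Big(\frac{\partial u}{\partial y}, \frac{\partial u}{\partial t}, u, y, t\Big) \]

under the conditions $u(0, t) = u(\pi, t) = 0$, $u(y, 0) = f(y)$, $u_t(y, 0) = g(y)$ (where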


f and g are prescribed functions) can be reduced in the following way to the 
problem of integrating an infinite system of ordinary differential equations, 


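\[ \frac{d^2 x_n}{dt^2} + n^2 x_n = f_n\Big(t, x_1, \frac{dx_1}{dt}, x_2, \frac{dx_2}{dt}, \dots\Big) \qquad (n = 1, 2, 3, \dots). \]

We want the solution to be valid in a rectangular region, $0 \le y \le \pi$,
$0 \le t \le K$ ($K > 0$). We assume the trigonometric developments

\[ f(y) = \sum_{k=1}^{\infty} a_k \sin ky, \qquad g(y) = \sum_{k=1}^{\infty} a_k' \sin ky, \qquad u(y, t) = \sum_{k=1}^{\infty} x_k(t) \sin ky, \]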

where the $a_k$ and $a_k'$ are known constants and the $x_k(t)$ are unknown functions.
If we formally differentiate the series for u(y, t), substitute in the partial
differential equation, multiply through by $(2/\pi) \sin ny$, and integrate with
respect to y from 0 to π, making use of the orthogonal properties of the sine
functions, we get the nth equation of the infinite system written above with

\[ f_n = \frac{2}{\pi} \int_0^{\pi} F\Big( \sum_{k=1}^{\infty} k x_k \cos ky,\ \sum_{k=1}^{\infty} \frac{dx_k}{dt} \sin ky,\ \sum_{k=1}^{\infty} x_k \sin ky,\ y,\ t \Big) \sin ny \, dy. \]
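
The projection step just described rests on the orthogonality of the sines on (0, π). As a sketch only, assuming that the series for u may be differentiated term by term (which is precisely the formal step that the paper sets out to justify), the left member of the partial differential equation projects as follows:

```latex
\[
\frac{2}{\pi}\int_0^{\pi}\sin ky\,\sin ny\,dy=\delta_{kn},
\qquad
\frac{2}{\pi}\int_0^{\pi}
\Big(\frac{\partial^2u}{\partial t^2}-\frac{\partial^2u}{\partial y^2}\Big)\sin ny\,dy
=\frac{2}{\pi}\int_0^{\pi}\sum_{k=1}^{\infty}
\Big(\frac{d^2x_k}{dt^2}+k^2x_k\Big)\sin ky\,\sin ny\,dy
=\frac{d^2x_n}{dt^2}+n^2x_n .
\]
```

The symbol $\delta_{kn}$ (equal to 1 when k = n and 0 otherwise) is introduced here for illustration and is not part of the original notation.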


* Presented to the Society, December 27, 1932; received by the editors September 20, 1932, and
in revised form and with addition of Part III, February 7, 1933.






We must evidently solve our infinite system under the initial conditions

\[ x_n(0) = a_n, \qquad \frac{dx_n}{dt}\Big|_{t=0} = a_n'. \]

It is the principal object of this paper to put the above formal procedure
upon a rigorous basis.
In Part I, we shall study the slightly more general system

\[ \frac{d^2 x_n}{dt^2} + \mu_n^2 x_n = f_n \qquad (n = 1, 2, 3, \dots), \]

the $\mu_n$ being arbitrary positive constants, together with the initial conditions
given above. Actually we shall study this system in the equivalent integral
form,

\[ x_n(t) = a_n \cos \mu_n t + (a_n'/\mu_n) \sin \mu_n t + (1/\mu_n) \int_0^t f_n\{\tau, x_1(\tau), x_1'(\tau), \dots\} \sin \mu_n (t - \tau) \, d\tau. \]
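
The equivalence of the differential and integral forms can be checked numerically for a single equation. The following is a minimal sketch, not part of the paper: it assumes a forcing term f that depends on τ only, and the names `duhamel` and `exact_constant_forcing` are hypothetical. The integral is evaluated by the trapezoidal rule and compared with the closed-form solution of x'' + μ²x = 1.

```python
import math

def duhamel(a, a_prime, mu, f, t, steps=2000):
    """Evaluate x(t) = a cos(mu t) + (a'/mu) sin(mu t)
    + (1/mu) * integral_0^t f(tau) sin(mu (t - tau)) dtau,
    using the composite trapezoidal rule for the integral."""
    if t == 0.0:
        return a
    h = t / steps
    total = 0.0
    for i in range(steps + 1):
        tau = i * h
        weight = 0.5 if i in (0, steps) else 1.0
        total += weight * f(tau) * math.sin(mu * (t - tau))
    return a * math.cos(mu * t) + (a_prime / mu) * math.sin(mu * t) + h * total / mu

def exact_constant_forcing(a, a_prime, mu, t):
    """Closed-form solution of x'' + mu^2 x = 1 with x(0) = a, x'(0) = a'."""
    return ((a - 1.0 / mu**2) * math.cos(mu * t)
            + (a_prime / mu) * math.sin(mu * t) + 1.0 / mu**2)

# Illustrative values only (assumptions, not taken from the paper).
approx = duhamel(0.5, -1.0, 3.0, lambda tau: 1.0, 1.3)
exact = exact_constant_forcing(0.5, -1.0, 3.0, 1.3)
print(abs(approx - exact))
```

In the paper the forcing term also depends on the unknown functions themselves, which is why the integral form is an equation to be solved rather than an explicit formula.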


In Part II, we shall apply the results of Part I to partial differential 
equations, thus obtaining an existence theorem. 

This plan has already been carried out by L. Lichtenstein* for equations 
of considerably more restricted type. The right hand side of Lichtenstein’s 
equation is, in fact, independent of $\partial u/\partial y$ and $\partial u/\partial t$ and can be developed
in a power series in u: 

Ou ou > 
On the other hand, the essential requirement laid down by us is that F should 
obey a certain Lipschitz condition in its first three arguments. The present 
results also represent a generalization beyond Lichtenstein’s work in that 
the requirements on the initial values, f(y) and g(y), are much less restrictive.
Here it is merely assumed that f'(y) and g(y) have summable squares
on $0 < y < \pi$, or in other words that $\sum k^2 a_k^2$ and $\sum a_k'^2$ converge; whereas
Lichtenstein assumes the convergence of k?| and |. The general- 
izations that Lichtenstein does carry through in other directions (as to the 
shape of the region and the nature of the end or “boundary” conditions) can 
equally well be carried out here. 

On the other hand our generalizations are gained at a certain sacrifice. 
The solution u(y, t) produced by Lichtenstein is a solution in the ordinary
sense, whereas the u(y, t) produced by us may be a solution only in a certain
generalized sense to be defined later. This generalized notion of a solution of 
a partial differential equation is, however, a natural one, and has been used 


* See bibliography at the end of this introduction. 




by other authors. N. Wiener,* for example, has given a generalization, which, 
while not assuming the existence of the first derivatives, $\partial u/\partial y$ and $\partial u/\partial t$,
applies only to linear equations. He gives references to Bôcher and G. C.
Evans. My own definition requires the existence of the first derivatives, but, 
so far as I know, it is the only one which applies to the general second-order 
partial differential equation, linear or not. 

A bibliography of the literature on infinite systems of differential equa- 
tions appears at the end of this introduction. This bibliography is complete 
so far as I have been able to ascertain. None of the work there listed, with 
the exception of Lichtenstein’s and Siddiqi’s, can be applied here. The reason 
is that the usual existence theorems for infinite systems of differential equations
of the form $dz_k/dt = \zeta_k(t, z_1, z_2, \dots)$, with initial conditions $z_k(0) = c_k$,
assume a too restrictive correspondence between the laws of decrease of
the $|z_k - c_k|$ and the $|\zeta_k|$. This correspondence is roughly of the nature
that the convergence of $\sum_k |z_k - c_k|^2$ implies the convergence of $\sum_k |\zeta_k|^2$
for t suitably restricted. Evidently such an assumption fails to take into
account even the following highly degenerate example which can be integrated
immediately:

\[ \frac{dz_{2n-1}}{dt} = n z_{2n} = \zeta_{2n-1}(t, z_1, z_2, \dots), \qquad \frac{dz_{2n}}{dt} = -n z_{2n-1} = \zeta_{2n}(t, z_1, z_2, \dots), \qquad z_k(0) = c_k \qquad (n, k = 1, 2, 3, \dots). \]

Assume the convergence of $\sum_k c_k^2$. Then the convergence of $\sum_k z_k^2$ would
ensure the convergence of $\sum_k |z_k - c_k|^2$, but not that of $\sum_k \zeta_k^2$. Nevertheless
such an infinite system is extremely useful in the applications to partial
differential equations. This particular simple system is included in the
theories presented both by Lichtenstein and by me. For it may be written
in the form

\[ \frac{d^2 x_n}{dt^2} + n^2 x_n = 0, \]

if we set

\[ x_n = z_{2n-1}, \qquad \frac{dx_n}{dt} = n z_{2n}. \]


But it can be easily shown that a large field still awaits exploration. 
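
The failure described for the degenerate example can be made concrete. The sketch below is an illustration under an assumed choice of initial values, c_{2n-1} = 1/n and c_{2n} = 0 (this choice and the function name `partial_sums` are hypothetical, not from the paper). It uses the explicit rotation solution of each pair of equations and accumulates the partial sums of z_k² and ζ_k².

```python
import math

def partial_sums(N, t):
    """Return (sum of z_k^2, sum of zeta_k^2) over the first 2N coordinates
    of the degenerate system dz_{2n-1}/dt = n z_{2n}, dz_{2n}/dt = -n z_{2n-1},
    with the assumed initial values c_{2n-1} = 1/n, c_{2n} = 0."""
    sum_z2 = 0.0
    sum_zeta2 = 0.0
    for n in range(1, N + 1):
        c_odd, c_even = 1.0 / n, 0.0
        # Each pair (z_{2n-1}, z_{2n}) rotates with angular speed n.
        z_odd = c_odd * math.cos(n * t) + c_even * math.sin(n * t)
        z_even = -c_odd * math.sin(n * t) + c_even * math.cos(n * t)
        zeta_odd = n * z_even      # right member of the (2n-1)th equation
        zeta_even = -n * z_odd     # right member of the (2n)th equation
        sum_z2 += z_odd**2 + z_even**2
        sum_zeta2 += zeta_odd**2 + zeta_even**2
    return sum_z2, sum_zeta2

for N in (10, 100, 1000):
    print(N, partial_sums(N, 0.7))
```

With this choice the first sum stays below π²/6 for every N, while the second grows in proportion to N, so no implication from the convergence of the one to the convergence of the other can hold.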

The infinite systems considered in this paper are formally quite like those 
treated by Lichtenstein and quite unlike those treated by W. L. Hart in his 
paper of 1922. Nevertheless the methods are much more similar to Hart’s 


* Mathematische Annalen, vol. 95 (1926), p. 582. 




methods than to Lichtenstein’s; and the author wishes to acknowledge here 
his less obvious debt to Hart. 

The applications of the results of Part I are probably not limited to the
problems considered in Part II. Instead of using the trigonometric expansions, 
exclusively considered in Part II, one might use general Sturm-Liouville 
orthogonal functions. Such a procedure might furnish theories for non-linear 
normal hyperbolic equations (in any number of independent variables) with 
boundary conditions of a much more complicated type than those considered 
here. Here also is a large field awaiting exploration. 

Existence theorems for the Cauchy problem with non-analytic initial 
conditions have not yet been given for general non-linear* hyperbolic 
equations, except for the case of two independent variables, which has been 
most elegantly treated by H. Lewy.† It may be that the method of infinite
systems of ordinary differential equations will furnish the key to the prob- 
lem. Even in the case of two independent variables Lewy’s work is ap- 
plicable only to the unmixed Cauchy problem, whereas this method is 
applicable to the mixed problem, where boundary conditions as well as initial 
conditions play a prominent rôle. Further developments await more general
existence theorems for infinite systems of differential equations. 


BIBLIOGRAPHY OF THE THEORY OF INFINITE SYSTEMS OF DIFFERENTIAL 
EQUATIONS 


H. von Koch, Sur les systèmes d’ordre infini d’équations différentielles,
Öfversigt af Kongliga Vetenskaps-Akademiens Förhandlingar, vol. 56 (1899),
pp. 395-411 (analytic non-linear theory). 

E. H. Moore, New Haven Mathematical Colloquium, 1906 (a linear theory 
in the sense of “general analysis”). 

F. R. Moulton, Solution of an infinite system of differential equations of 
the analytic type, Proceedings of the National Academy of Sciences, vol. 1 
(1915), pp. 350-354 (analytic non-linear theory). The same work is published 
in the text-book on differential equations by the same author. 

T. H. Hildebrandt, On a theory of linear differential equations in general 
analysis, these Transactions, vol. 18 (1917), pp. 73-96. 

W. L. Hart, Differential equations and implicit functions in infinitely many 


* Riemann and Hadamard have laid the foundation for the linear case. See the latter’s book 
Lectures on Cauchy’s Problem. 

See also M. Mathisson, Eine neue Lösungsmethode für Differentialgleichungen von normalen hyperbolischen
Typus, Mathematische Annalen, vol. 107 (1932), pp. 400-419.

† Über das Anfangswertproblem einer hyperbolischen nichtlinearen partiellen Differentialgleichung
zweiter Ordnung mit zwei unabhängigen Veränderlichen, Mathematische Annalen, vol. 98 (1927), pp.
179-191.




variables, these Transactions, vol. 18 (1917), pp. 125-160; Functions of in- 
finitely many variables in Hilbert space, these Transactions, vol. 23 (1922), 
pp. 30-50 (non-analytic non-linear theory); Linear differential equations in 
infinitely many variables, American Journal of Mathematics, vol. 39 (1917), 
pp. 407-424; The Cauchy-Lipschitz method for infinite systems of differential
equations, American Journal of Mathematics, vol. 43 (1921), pp. 226-231. 

I. A. Barnett, Differential equations with a continuous infinitude of va- 
riables, American Journal of Mathematics, vol. 44 (1922), pp. 172-190; 
Linear partial differential equations with a continuous infinitude of variables, 
American Journal of Mathematics, vol. 45 (1923), pp. 42-53. 

A. Wintner, Zur Theorie der unendlichen Differentialsysteme, Mathematische
Annalen, vol. 95 (1925), pp. 544-556; Zur Lösung von Differentialsystemen
mit unendlich vielen Veränderlichen, Mathematische Annalen, vol.
98 (1927), pp. 273-280 (analytic non-linear theory) ; Zur Analysis im Hilbert- 
schen Raume, Mathematische Zeitschrift, vol. 28 (1928), pp. 451-470; Upon 
a theory of infinite systems of non-linear implicit and differential equations, 
American Journal of Mathematics, vol. 53 (1931), pp. 241-257. 

L. Lichtenstein, Zur Theorie partieller Differentialgleichungen zweiter 
Ordnung vom hyperbolischen Typus, Journal fiir die reine und angewandte 
Mathematik, vol. 158 (1927), pp. 80-91. 

W. T. Reid, Properties of solutions of an infinite system of ordinary linear 
differential equations of the first order with auxiliary boundary conditions, 
these Transactions, vol. 32 (1930), pp. 284-318. 

M. R. Siddiqi, Zur Theorie der nichtlinearen partiellen Differentialgleich- 
ungen vom parabolischen Typus, Mathematische Zeitschrift, vol. 35 (1932), 
pp. 464-484.

In addition to the above papers there is also an extensive literature deal- 
ing with the single differential equation of infinite order in one unknown 
function. This theory is closely connected with the Heaviside operational 
calculus and has little or nothing in common with the theory of infinite 
systems of differential equations. The device whereby the differential equation

\[ F\Big(t, x, \frac{dx}{dt}, \frac{d^2 x}{dt^2}, \dots, \frac{d^n x}{dt^n}\Big) = 0 \]

can be put into the form of a system of equations

\[ \frac{dx_k}{dt} = f_k(t, x_1, \dots, x_n) \qquad (k = 1, \dots, n) \]

by setting

\[ x_k = \frac{d^{k-1} x}{dt^{k-1}} \]

apparently fails when $n = \infty$. The bibliography of the single equation of
infinite order may be found in a footnote to a paper by H. T. Davis in the 
Annals of Mathematics, (2), vol. 32 (1931), pp. 686-714. It mentions the 
following authors: Bourlet, Bromwich, von Koch, Pincherle, Ritt, Schiirer, 
Scheffer, Valiron, Wiener. 


Part I 


1. Notation, terminology, definitions, and lemmas. We consider infinite
systems of equations of the form,

\[ (1.1) \qquad x_k(t) = \phi_k(t) + \frac{1}{\mu_k} \int_0^t f_k\{\tau, x(\tau)\} \sin \mu_k (t - \tau) \, d\tau \qquad (k = 1, 2, \dots). \]

The $x_k(t)$ are the unknown functions.

The $\mu_k$ are any positive numbers.

$\phi_k(t)$ is an abbreviation for $a_k \cos \mu_k t + (a_k'/\mu_k) \sin \mu_k t$, where $a_k$ and $a_k'$
are for the present completely arbitrary, except that, in common with all
other numbers arising in this paper, they are real.

$f_k\{\tau, x(\tau)\}$ is a function depending upon $k, \tau, x_1(\tau), (d/d\tau) x_1(\tau), x_2(\tau),
(d/d\tau) x_2(\tau), \dots$.

In general, an italic letter followed by {t, x} will be an abbreviation for
a function dependent upon the infinitely many independent variables, $t, x_1,
x_1', x_2, x_2', x_3, x_3', \dots$. On the other hand, a Greek letter, with a superscript
n, followed by {t, x} will indicate a function of the first 2n+1 of these
variables.

Thus $F\{t, x\}$ depends upon $t, x_1, x_1', x_2, x_2', \dots$, while $\psi^{(n)}\{t, x\}$ depends
upon $t, x_1, x_1', \dots, x_n, x_n'$ only.

By a “point” in “function space” we shall mean an infinite sequence of
numbers, called “coordinates.” We shall deal with two types of function
space:

In considering type 1, the nth coordinate of a point will usually be denoted
by a letter with the subscript n, e.g. $x_n$. A point in function space of type 1,
whose coordinates are represented by $x_1, x_2, x_3, \dots$, will be denoted briefly
by [x].

In dealing with type 2, the nth coordinate will be denoted by a letter
unprimed with the subscript ½(n+1), if n is odd, and primed with the subscript
½n, if n is even. Thus the symbols $x_1, x_1', x_2, x_2', x_3, x_3', \dots$ may be
taken to represent in the proper order the coordinates of a point in space of
type 2. Such a point with coordinates represented by these symbols is denoted
by (x). Here we use parentheses instead of the square brackets reserved for
points of function space of type 1.



We shall use different “distance” functions for the two spaces. We begin
by defining the following symbols:

\[ |x|_{m,n} = \Big[ \sum_{k=m}^{n} |x_k|^p \Big]^{1/p}; \qquad |x|_{m,\infty} = \lim_{n \to \infty} |x|_{m,n}, \ \text{if this limit exists;} \]

\[ \|x\|_{m,n} = \Big[ \sum_{k=m}^{n} \mu_k^p |x_k|^p + \sum_{k=m}^{n} |x_k'|^p \Big]^{1/p}; \qquad \|x\|_{m,\infty} = \lim_{n \to \infty} \|x\|_{m,n}, \ \text{if this limit exists.} \]

p is a positive constant not less than 1. For the sake of brevity, the dependence
of these symbols on $\mu_k$ and p is not indicated. The “distance” between
two points [b] and [c] is defined as $|b - c|_{1,\infty}$. The “distance” between two
points (b) and (c) is defined as $\|b - c\|_{1,\infty}$.

The symbols obey the following classic inequalities:

\[ (1.2)^* \qquad |b + c|_{m,n} \le |b|_{m,n} + |c|_{m,n}, \qquad \|b + c\|_{m,n} \le \|b\|_{m,n} + \|c\|_{m,n}. \]

For our purposes, a region in function space is simply the collection of
points whose coordinates satisfy certain conditions. These conditions are
usually given in the form of inequalities. Two very special regions Q and R
will be largely used in this paper. They are defined as follows:

Let q be a positive number. The point (x) belongs to the region Q(q) if

\[ (1.3) \qquad \|x\|_{1,\infty} \le q. \]

Let r be a positive number. The point (x) belongs to the region R(r) if for
at least one value of t the inequality

\[ (1.4) \qquad \|x - \phi(t)\|_{1,\infty} \le r \ \text{is valid,} \]

where $\phi_1(t), \phi_2(t), \dots$ have already been defined and $\phi_k'(t) = (d/dt) \phi_k(t)$.

The functions $f_k\{t, x\}$ which we consider are of a special type which we
shall call “convergent.” A function of this type, depending upon an infinite
number of variables, is defined as the limit of a sequence of functions, each
one of which depends only upon a finite number of variables. To be more
precise, we write a formal definition:

DEFINITION. A function f{t, x}, defined for t in some interval, $0 \le t \le T$, and
for (x) in some region S of (type 2) function space, is said to be of convergent
type, if there exists a sequence of functions, $\psi^{(n)}\{t, x\}$, n = 1, 2, ..., the nth
function $\psi^{(n)}\{t, x\}$ being defined for $0 \le t \le T$ and for all sets of values for $x_1,
x_1', x_2, x_2', \dots, x_n, x_n'$ which are the first 2n coordinates of any point (x) in S,
such that for any fixed point (x) in S,

\[ \lim_{n \to \infty} \psi^{(n)}\{t, x\} = f\{t, x\}. \]

* Cf. F. Riesz, Les Systèmes d’Équations Linéaires à une Infinité d’Inconnues, p. 43 et seq.
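
The two distance symbols of this section are finite sums when n is finite, so they are easy to compute. The following sketch is illustrative only; the function names `single_norm` and `double_norm` are hypothetical, and sequences are stored as Python lists whose index k-1 holds the coordinate with subscript k.

```python
def single_norm(x, m, n, p):
    """|x|_{m,n}: the p-th root of the sum of |x_k|^p for k = m, ..., n."""
    return sum(abs(x[k - 1]) ** p for k in range(m, n + 1)) ** (1.0 / p)

def double_norm(x, x_prime, mu, m, n, p):
    """||x||_{m,n}: the p-th root of the sum of mu_k^p |x_k|^p + |x_k'|^p
    for k = m, ..., n."""
    total = sum((mu[k - 1] * abs(x[k - 1])) ** p + abs(x_prime[k - 1]) ** p
                for k in range(m, n + 1))
    return total ** (1.0 / p)

# Illustrative values only.
b = [1.0, -2.0, 3.0]
c = [0.5, 4.0, -1.0]
b_plus_c = [bk + ck for bk, ck in zip(b, c)]
# The triangle (Minkowski) inequality for the single-bar symbol with p = 3:
print(single_norm(b_plus_c, 1, 3, 3.0),
      single_norm(b, 1, 3, 3.0) + single_norm(c, 1, 3, 3.0))
```

The symbols with n = ∞ are limits of these finite quantities and can only be approximated by taking n large.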


The usefulness of this definition rests on the following lemma:

LEMMA 1. Let $\psi^{(n)}\{t, x\}$ be continuous in its 2n+1 arguments. Let $|\psi^{(n)}\{t, x\}| < M$,
where M is some number independent of $n, t, x_1, x_1', x_2, x_2', \dots$. Let
$x_1(t), x_2(t), \dots$ be a set of functions, each of which is defined and of class C'
on $0 \le t \le T$, and set $x_k'(t) = (d/dt) x_k(t)$. Let these functions be such that (x(t))
lies in S for $0 \le t \le T$. Finally let g(t) be defined and integrable on $0 \le t \le T$.

Then $\int_0^T f\{t, x(t)\} g(t) \, dt$ exists in the sense of Lebesgue and is in fact equal
to $\lim_{n \to \infty} \int_0^T \psi^{(n)}\{t, x(t)\} g(t) \, dt$.

The proof of this lemma together with the following corollary is left to
the reader.

COROLLARY. Let μ be a constant. Then under the hypotheses of Lemma 1
$\int_0^t f\{\tau, x(\tau)\} \sin \mu (t - \tau) \, d\tau$ is of class C' for $0 \le t \le T$, possessing almost everywhere
in this interval a second derivative.

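We need two more simple inequalities before proceeding to the existence
proof of the next section.

If f(t) is integrable on $0 \le t \le T$, we have the classic inequality*

\[ \Big| \int_0^t f(\tau) \, d\tau \Big|^p \le t^{p-1} \int_0^t |f(\tau)|^p \, d\tau \qquad \text{for } 0 \le t \le T. \]

Hence, if $f_1(t), f_2(t), \dots$ is an infinite sequence of functions, each of which
is integrable on $0 \le t \le T$, then

\[ (1.5) \qquad \Big| \int_0^t f(\tau) \, d\tau \Big|_{m,n}^p \le t^{p-1} \int_0^t |f(\tau)|_{m,n}^p \, d\tau \qquad \text{for } 0 \le t \le T. \]

If $|f(t)|_{m,n}$ is bounded uniformly with respect to n and t, and if $|f(t)|_{m,\infty}$ exists
for each t, we can, by Lebesgue’s theorem, pass to the limit and write

\[ (1.6) \qquad \Big| \int_0^t f(\tau) \, d\tau \Big|_{m,\infty}^p \le t^{p-1} \int_0^t |f(\tau)|_{m,\infty}^p \, d\tau. \]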
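
The classic inequality on which (1.6) is built can be observed numerically. The sketch below is an illustration, not part of the paper: the name `holder_gap` is hypothetical and both integrals are replaced by midpoint sums, for which the same inequality holds exactly (apart from rounding) by the convexity of |s|^p.

```python
import math

def holder_gap(f, t, p, steps=1000):
    """Return t^(p-1) * integral_0^t |f|^p  minus  |integral_0^t f|^p,
    with both integrals approximated by the midpoint rule.
    The inequality of the text says this quantity is never negative."""
    h = t / steps
    values = [f((i + 0.5) * h) for i in range(steps)]
    integral_f = h * sum(values)
    integral_fp = h * sum(abs(v) ** p for v in values)
    return t ** (p - 1) * integral_fp - abs(integral_f) ** p

# Illustrative choices of f, t and p.
print(holder_gap(math.cos, 2.0, 3.0))
print(holder_gap(lambda s: 1.0, 2.0, 3.0))
```

For a constant integrand the two sides coincide, which shows that the factor t^{p-1} cannot be improved.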
0 ™ 


2. Fundamental existence theorem for equations (1.1). We suppose that 
there exist four positive numbers r, A, B, T, such that the following three 
hypotheses hold: 


* E. W. Hobson, Functions of a Real Variable, 3d edition, vol. 1, p. 643. 




HYPOTHESIS 2.I. The $f_k\{t, x\}$ are defined and are of convergent type for
(x) in R(r) and for $0 \le t \le T$. The approximation functions $\psi_k^{(n)}\{t, x\}$ are
continuous and uniformly bounded for each k.

HYPOTHESIS 2.II. $|f\{t, \phi(t')\}|_{1,\infty} \le B$ for $0 \le t \le T$ and $-\infty < t' < +\infty$.

HYPOTHESIS 2.III. $|f\{t, x\} - f\{t, \bar{x}\}|_{1,\infty} \le A \cdot \|x - \bar{x}\|_{1,\infty}$ where (x) and ($\bar{x}$) are
both points of R(r) and $0 \le t \le T$.

Then there exists a unique set of functions $x_1(t), x_2(t), \dots$, each of class C'
on the interval $0 \le t \le K$ (K is the smaller of the two numbers T and
$2^{-1/p} r/(Ar + B)$) with the following two properties:

I. (x(t)) belongs to R for $0 \le t \le K$.

II. If these functions are substituted in (1.1) the right hand members exist
in the sense of Lebesgue and are identically equal to the left members for $0 \le t \le K$.

Such a set of functions will be called a solution.
We first note as a consequence of Hypotheses 2.II and 2.III and (1.2) that

\[ (2.1) \qquad |f\{t, x\}|_{1,\infty} \le Ar + B = C \qquad \text{for (x) in R.} \]

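The actual solution is constructed from the following system of successive
approximations:

\[ (2.2) \qquad x_k^{(0)}(t) = \phi_k(t), \qquad x_k^{(n)}(t) = \phi_k(t) + \frac{1}{\mu_k} \int_0^t f_k\{\tau, x^{(n-1)}(\tau)\} \sin \mu_k (t - \tau) \, d\tau \qquad (n = 1, 2, 3, \dots). \]

Differentiating these, we have also

\[ (2.3) \qquad x_k^{(n)\prime}(t) = \frac{d}{dt} x_k^{(n)}(t) = \phi_k'(t) + \int_0^t f_k\{\tau, x^{(n-1)}(\tau)\} \cos \mu_k (t - \tau) \, d\tau. \]

We prove by induction that, for $0 \le t \le K$, $x_k^{(n)}(t)$ exists and is of class C'
(cf. Lemma 1 and its corollary), and that $(x^{(n)}(t))$ belongs to R. Assuming
these facts true for $(x^{(n-1)}(t))$, it follows from (2.2), (2.3), and (1.6) that

\[ \|x^{(n)}(t) - \phi(t)\|_{1,\infty}^p = \Big| \int_0^t f\{\tau, x^{(n-1)}(\tau)\} \sin \mu (t - \tau) \, d\tau \Big|_{1,\infty}^p + \Big| \int_0^t f\{\tau, x^{(n-1)}(\tau)\} \cos \mu (t - \tau) \, d\tau \Big|_{1,\infty}^p \]

\[ \le 2 t^{p-1} \int_0^t |f\{\tau, x^{(n-1)}(\tau)\}|_{1,\infty}^p \, d\tau \le 2 t^{p-1} \int_0^t C^p \, d\tau = 2 C^p t^p \le r^p, \]

i.e. $(x^{(n)}(t))$ belongs to R. (In the kth component of the integrals, μ stands for $\mu_k$.)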


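Since the stated facts are obviously true for $x_k^{(0)}(t)$, they are by induction
true for $x_k^{(n)}(t)$.

We next prove that $x_k^{(n)}(t)$ converges uniformly toward a limit function
$x_k(t)$ as n becomes infinite. By setting n = 1, we have from the above inequalities

\[ \|x^{(1)}(t) - x^{(0)}(t)\|_{1,\infty}^p \le 2 C^p t^p. \]

We also obtain from (2.2), (2.3), (1.6), and Hypothesis 2.III

\[ \|x^{(n+1)}(t) - x^{(n)}(t)\|_{1,\infty}^p = \Big| \int_0^t \big[ f\{\tau, x^{(n)}(\tau)\} - f\{\tau, x^{(n-1)}(\tau)\} \big] \sin \mu (t - \tau) \, d\tau \Big|_{1,\infty}^p + \Big| \int_0^t \big[ f\{\tau, x^{(n)}(\tau)\} - f\{\tau, x^{(n-1)}(\tau)\} \big] \cos \mu (t - \tau) \, d\tau \Big|_{1,\infty}^p \]

\[ \le 2 t^{p-1} \int_0^t |f\{\tau, x^{(n)}(\tau)\} - f\{\tau, x^{(n-1)}(\tau)\}|_{1,\infty}^p \, d\tau \le 2 A^p t^{p-1} \int_0^t \|x^{(n)}(\tau) - x^{(n-1)}(\tau)\|_{1,\infty}^p \, d\tau. \]

It then follows by induction that

\[ \|x^{(n)}(t) - x^{(n-1)}(t)\|_{1,\infty}^p \le \frac{2^n C^p A^{p(n-1)} t^{np}}{(p + 1)(2p + 1) \cdots ([n - 1]p + 1)}. \]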
The uniform convergence of $x_k^{(n)}(t)$ and $x_k^{(n)\prime}(t)$ for $0 \le t \le K$ now follows
from the Weierstrass test.
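
The induction behind the bound used in the Weierstrass test is short enough to write out. As a sketch, write $d_n(t)$ for the pth power of the distance between the nth and (n-1)th approximations (a shorthand introduced here, not the paper's), and assume the recurrence and the case n = 1 stated above:

```latex
\[
d_{n+1}(t)\le 2A^{p}t^{p-1}\int_0^t d_n(\tau)\,d\tau
\le 2A^{p}t^{p-1}\cdot
\frac{2^{n}C^{p}A^{p(n-1)}}{(p+1)(2p+1)\cdots([n-1]p+1)}\cdot\frac{t^{np+1}}{np+1}
=\frac{2^{n+1}C^{p}A^{pn}\,t^{(n+1)p}}{(p+1)(2p+1)\cdots(np+1)} .
\]
```

Taking pth roots and replacing t by K gives constants whose successive ratios, $2^{1/p} A K / ([n+1]p + 1)^{1/p}$, tend to zero, so that the comparison series converges.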
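We also find, using (1.2), that

\[ \|x(t) - x^{(n)}(t)\|_{1,\infty} \le \sum_{m=n}^{\infty} \|x^{(m+1)}(t) - x^{(m)}(t)\|_{1,\infty} \le \sum_{m=n}^{\infty} \frac{2^{(m+1)/p} C A^m K^{m+1}}{[(p + 1)(2p + 1) \cdots (mp + 1)]^{1/p}}, \]

which is the remainder after (n-1) terms of a certain convergent series of
positive constants. Now $|f\{t, x(t)\} - f\{t, x^{(n)}(t)\}|_{1,\infty} \le A \cdot \|x(t) - x^{(n)}(t)\|_{1,\infty}$, so that

\[ \lim_{n \to \infty} f_k\{t, x^{(n)}(t)\} = f_k\{t, x(t)\} \]

uniformly for $0 \le t \le K$.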

This is all that is needed to complete the proof that the functions $x_k(t)$
satisfy equations (1.1). The proof of the uniqueness of this solution follows
essentially the same lines and is left to the reader.
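
The successive approximations of this section can be imitated numerically for a finite truncation. The following is a rough sketch under stated assumptions, not the paper's construction: the system is cut off at N equations, the integrals are replaced by trapezoidal sums on a fixed grid, and the right members `toy_rhs` are a hypothetical Lipschitz example chosen only for illustration. The function records the largest change between consecutive iterates.

```python
import math

def picard(a, a_prime, mu, f, T, iterations=8, steps=100):
    """Iterate x_k <- phi_k + (1/mu_k) * integral f_k sin(mu_k (t - tau)) dtau
    and x_k' <- phi_k' + integral f_k cos(mu_k (t - tau)) dtau on a grid.
    f(t, x, xp) must return a list of N values. Returns (grid, x, xp, gaps),
    where gaps[n] is the largest change in any x_k at the (n+1)th iteration."""
    N = len(a)
    h = T / steps
    grid = [j * h for j in range(steps + 1)]
    phi = [[a[k] * math.cos(mu[k] * t) + a_prime[k] / mu[k] * math.sin(mu[k] * t)
            for k in range(N)] for t in grid]
    dphi = [[-a[k] * mu[k] * math.sin(mu[k] * t) + a_prime[k] * math.cos(mu[k] * t)
             for k in range(N)] for t in grid]
    x = [row[:] for row in phi]      # zeroth approximation: x = phi
    xp = [row[:] for row in dphi]
    gaps = []
    for _ in range(iterations):
        fv = [f(grid[i], x[i], xp[i]) for i in range(steps + 1)]
        new_x, new_xp = [], []
        for j, t in enumerate(grid):
            row_x, row_xp = [], []
            for k in range(N):
                s = c = 0.0
                if j > 0:  # the integral over [0, 0] is zero
                    for i in range(j + 1):
                        w = 0.5 if i in (0, j) else 1.0
                        arg = mu[k] * (t - grid[i])
                        s += w * fv[i][k] * math.sin(arg)
                        c += w * fv[i][k] * math.cos(arg)
                row_x.append(phi[j][k] + h * s / mu[k])
                row_xp.append(dphi[j][k] + h * c)
            new_x.append(row_x)
            new_xp.append(row_xp)
        gaps.append(max(abs(new_x[j][k] - x[j][k])
                        for j in range(steps + 1) for k in range(N)))
        x, xp = new_x, new_xp
    return grid, x, xp, gaps

def toy_rhs(t, x, xp):
    """Hypothetical right members f_k = sin(x_1) / k^2 (illustration only)."""
    return [math.sin(x[0]) / (k + 1) ** 2 for k in range(len(x))]

grid, x, xp, gaps = picard([1.0, 0.0, 0.0], [0.0, 0.0, 0.0],
                           [1.0, 2.0, 3.0], toy_rhs, 1.0)
print(gaps)
```

The rapid decrease of the recorded changes mirrors the factorial-type bound of the proof; the sketch says nothing about the passage to infinitely many equations, which is the subject of the next section.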

3. Approximation by solutions of finite systems. Let n be a positive
integer. Then corresponding to the infinite system of equations (1.1) there
is also the finite system,

\[ (3.1) \qquad x_{nk}(t) = \phi_k(t) + \frac{1}{\mu_k} \int_0^t \psi_k^{(n)}\{\tau, x_n(\tau)\} \sin \mu_k (t - \tau) \, d\tau \qquad (k = 1, 2, \dots, n) \]

for determining the unknown functions $x_{n1}(t), \dots, x_{nn}(t)$. It is the main object of this
article to consider, under certain hypotheses, the approximation to $x_1(t),
\dots, x_n(t)$, which are the first n functions of the solution of (1.1), by the
functions $x_{n1}(t), x_{n2}(t), \dots, x_{nn}(t)$, which form the solution of (3.1). We
prove the following


THEOREM. Suppose that the limits a | and | exist, and that there are four 
positive numbers r, A, B, and T independent of n so that the following hypotheses 
hold: 

Hypotuesis 3.1. The convergent functions f,{t, x} are defined in a region 
Q(q), where g=r+2"? and for 0<tST (see (1.3)). The approxi- 
mation functions have the special form Wi™{t, x}=filt, x1, xf, 
Xn, Xn, 0,0,0,---), R=1,2,---. 

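HYPOTHESIS 3.II. $|\psi^{(n)}\{t, \phi(t')\}|_{1,\infty} \le B$ for $0 \le t \le T$, $-\infty < t' < +\infty$.

HYPOTHESIS 3.III. $|f\{t, x\} - f\{t, \bar{x}\}|_{1,\infty} \le A \cdot \|x - \bar{x}\|_{1,\infty}$ for $0 \le t \le T$ and for (x) and
($\bar{x}$) in Q(q); $f_k\{t, x\}$ is continuous in t.

Then we may draw the following four conclusions:

CONCLUSION I. $R(r) \subset Q(q)$.

CONCLUSION II. Hypotheses 2.I, 2.II, 2.III of the preceding article hold and
consequently a unique solution, $x_1(t), x_2(t), \dots$, of (1.1) exists for $0 \le t \le K$.

CONCLUSION III. A solution $x_{n1}(t), \dots, x_{nn}(t)$ of equations (3.1) also
exists for $0 \le t \le K$ such that $\|x_n(t) - \phi(t)\|_{1,n} \le r$ and $|\psi^{(n)}\{t, x_n(t)\}|_{1,\infty} \le Ar + B
= C$.

CONCLUSION IV. $\lim_{n \to \infty} \|x(t) - x_n(t)\|_{1,n} = 0$, $\lim_{n \to \infty} \|x(t)\|_{n+1,\infty} = 0$, $\lim_{n \to \infty} \int_0^K
|\psi^{(n)}\{t, x_n(t)\}|_{n+1,\infty}^p \, dt = 0$, the first two of these limits holding uniformly for
$0 \le t \le K$.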




Conclusions I, II, and III are sufficiently obvious to require no proof. 
We content ourselves with a proof of IV. 
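From Hypotheses 3.I and 3.III we have

\[ (3.2) \qquad |f\{t, x\} - \psi^{(n)}\{t, \bar{x}\}|_{1,\infty}^p \le A^p \|x - \bar{x}\|_{1,n}^p + A^p \|x\|_{n+1,\infty}^p \qquad \text{if (x) and } (\bar{x}) \text{ both belong to Q.} \]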


Let ε be a preassigned positive number arbitrarily small. Choose $N_1$ so
large that


(3.3) +-1,20 + Ja Newt, e for ” = Ni. 
Inequality (2.1) holds, and it is easy to justify the relation 


K 
f If{r, x(z)} [pdr = > f | fe{r, x(r)}|"dr Ar+B). 
0 k=1 0 


Consequently it is possible to find N2 so great that 
K K 

(3.4) f If{r, = > | felt, x(r)} |"dr < for m= Nz. 
0 k=n+1 0 


Let N3 be the greater of the two numbers WN, and N-z so that both (3.3) and 
(3.4) hold as long as n= N3. Now, from (1.1) and yin we have 


|| — 2 | f{r, x(r)} 


in+1,0 


< axes If{r, 2K". 
0 
Also 
|| x(¢)|| < ||x(2) = (2)]| n41, + o> S + 


But since it is easy to show that 2" 
we have ||x(2)|| n41,.0S eas long as n= N3. 

Since $N_3$ is independent of t for $0 \le t \le K$, we have proved the second
relation under IV.

For convenience choose a number $N_4$ so large that


(3.5) || € as long asm = Ng. 
Now set up the successive approximations for equations (3.1): 
tne (t) = ox(t), 


= 
{ 




We find now from (1.1) and (1.5) that 
0 


and this by (3.2) is less than or equal to 


t 
f AP. ||x(7) — (r)]| 247-2 f 
0 0 


Therefore we obtain the following inequality: 
t 
(3.6) ||x(t) — S 2A 707-1 f — + 2A 
0 


which is valid for $0 \le t \le K$ and $n \ge N_4$. Furthermore it is clear that


| x(r) — 


Hence setting m=1 in (3.6) we obtain 


PPP 


It is now easy to prove by induction that 
2mA 
(p + 1)(2p + 1) --- ([m — 1]p + 1) 
m 
| (O-p + 1)(1-p + 1)(2-p+1)--- + 
Since 0</<K, we surely must have now 
2mA mp 
where D is equal to the value of the convergent series of positive constants 
iv 
(0-p + 1)(1-p + 1)2-p +1) 


Also it is known that 


— 


lim = ane(t) 


uniformly, and the limit of the first term on the right, as m increases indefi- 
nitely, is zero. 




Hence ||x(#) —2x,(é)||1,.<(€D)", as long as m=>N,. This proves the first 
relation under IV. 
By (1.2) we have 


K 


(3.7) 
+ f W{r, — x(r) 


We first appraise the first term on the right: 


eK?" by (3.4) forn = 


Thus the first term on the right of (3.7) is not greater than (eK?-')”/? for 
n= N;. The second term on the right of (3.7) is appraised by means of (3.2): 


S «A»(D + 1) forn = Ng. 


Thus the second term on the right of (3.7) is not greater than KA(e(D+1))"» 
for n= N,. Therefore 


where JN, is the greater of the two numbers NV; and N,, where C=Ar+B, and 
where E=C?-![KA(D+1)"/?+K?-»!?]. Also, from the obvious relations 


it follows that 


Ip 
» forn 2 Ns, 


1 
C1 


K 
f an(7) } Ee'’?, as long 2 Ns. 
0 
This completes the proof of the theorem. 


Part II 


4. The application of the results of Part I to partial differential equations. 
We consider partial differential equations of the form 


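\[ (4.1) \qquad \frac{\partial^2 u}{\partial t^2} - \frac{\partial^2 u}{\partial y^2} = F\Big(\frac{\partial u}{\partial y}, \frac{\partial u}{\partial t}, u, y, t\Big), \]

where $F(p_1, p_2, u, y, t)$ is defined for $-\infty < p_1 < +\infty$, $-\infty < p_2 < +\infty$, $|u| \le h$,
$0 \le y \le \pi$, $0 \le t \le T$; is uniformly continuous in y and t; and obeys a Lipschitz
condition in $p_1$, $p_2$, and u:

\[ (4.2) \qquad |F(p_1, p_2, u, y, t) - F(\bar{p}_1, \bar{p}_2, \bar{u}, y, t)| \le \alpha |p_1 - \bar{p}_1| + \beta |p_2 - \bar{p}_2| + \gamma |u - \bar{u}| \]

as long as $|u| \le h$, $|\bar{u}| \le h$.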

We retain the notation of §1 with the understanding, however, that p = 2
and $\mu_k = k$.

Let

\[ q = 6^{1/2} h / \pi. \]


Let there be given two functions, f(y) and g(y), defined for the interval
$0 \le y \le \pi$ and subject to the following conditions:

f(0) = f(π) = 0. f(y) is an indefinite integral possessing a derivative whose
square is summable, and furthermore 


2 , 
[f'(y) Pay < 
w Jo 
g(y) has a summable square, and furthermore 
2 
f [g(y) < 
Jo 
We shall try to find a solution of (4.1) such that

\[ (4.3) \qquad u(0, t) = u(\pi, t) = 0, \qquad u(y, 0) = f(y), \qquad \frac{\partial u}{\partial t}\Big|_{t=0} = g(y). \]

We know from the theory of trigonometric series that 


2 2 
=f [f'(y) = Jou]? < where = ark = f'(y) cos ky dy 
wT T Jo 


and 
2 2 
=f = < where af = g(y) sin ky dy. 
us 0 0 


Remembering that f(0) = f(π) = 0, we find on integrating by parts that

\[ a_k = \frac{2}{\pi} \int_0^{\pi} f(y) \sin ky \, dy. \]
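
The relation between the sine coefficients of f and the mean square of f' can be checked on a concrete function. The sketch below is illustrative only: the choice f(y) = y(π - y), which vanishes at both ends of the interval, and the name `sine_coefficient` are assumptions made here, and the integrals are approximated by the midpoint rule.

```python
import math

def sine_coefficient(f, k, steps=4000):
    """Approximate a_k = (2/pi) * integral_0^pi f(y) sin(k y) dy
    by the midpoint rule."""
    h = math.pi / steps
    total = sum(f((i + 0.5) * h) * math.sin(k * (i + 0.5) * h) for i in range(steps))
    return (2.0 / math.pi) * h * total

def f(y):
    """Assumed example with f(0) = f(pi) = 0."""
    return y * (math.pi - y)

coeffs = [sine_coefficient(f, k) for k in range(1, 51)]
# Partial sum of k^2 a_k^2, to be compared with (2/pi) * integral of f'(y)^2.
parseval_sum = sum((k * a_k) ** 2 for k, a_k in enumerate(coeffs, start=1))
energy = 2.0 * math.pi ** 2 / 3.0  # (2/pi) * integral_0^pi (pi - 2y)^2 dy
print(parseval_sum, energy)
```

For this f the coefficients with even k vanish and those with odd k equal 8/(πk³), so the series of k²a_k² converges quickly; a function whose derivative is merely square-summable would give much slower convergence.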




We also have the obvious inequality 21/*( | <q. Let r=q—2"*( 


+|o']). 
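Let (x) be a point of Q(q). Then from Schwarz’s inequality we have

\[ \sum_{k=1}^{\infty} |x_k| \le \Big[ \sum_{k=1}^{\infty} \frac{1}{k^2} \Big]^{1/2} \Big[ \sum_{k=1}^{\infty} k^2 x_k^2 \Big]^{1/2} \le \frac{\pi}{6^{1/2}} \|x\|_{1,\infty}. \]

Hence the series $u(y) = \sum_{k=1}^{\infty} x_k \sin ky$ converges absolutely and uniformly
with respect to y. With the help of the Riesz-Fischer theorem we may now
make the following statement: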


Corresponding to a point (x) in Q there is defined a continuous function,
u(y), for the interval $0 \le y \le \pi$, possessing almost everywhere a derivative u'(y),
whose square is summable, such that

\[ x_k k = \frac{2}{\pi} \int_0^{\pi} u'(y) \cos ky \, dy, \qquad u(y) = \int_0^y u'(\eta) \, d\eta, \qquad u(0) = u(\pi) = 0, \qquad |u(y)| \le h. \]

Corresponding to this same point (x) there is also defined for this interval a
second function v(y) whose square is summable and such that

\[ x_k' = \frac{2}{\pi} \int_0^{\pi} v(y) \sin ky \, dy. \]

u'(y) and v(y) are unique on $0 \le y \le \pi$ except possibly for point sets of measure
zero.


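Now let

\[ (4.4) \qquad f_k\{t, x\} = \frac{2}{\pi} \int_0^{\pi} F(u'(y), v(y), u(y), y, t) \sin ky \, dy. \]

Let [u(y), v(y)] and [$\bar{u}(y), \bar{v}(y)$] be the two pairs of functions corresponding
respectively to the two points (x) and ($\bar{x}$) in Q. From (4.2) we have

\[ [F(u'(y), v(y), u(y), y, t) - F(\bar{u}'(y), \bar{v}(y), \bar{u}(y), y, t)]^2 \le 3\alpha^2 [u'(y) - \bar{u}'(y)]^2 + 3\beta^2 [v(y) - \bar{v}(y)]^2 + 3\gamma^2 [u(y) - \bar{u}(y)]^2. \]

Now the $f_k\{t, x\}$ are the Fourier coefficients of $F(u'(y), v(y), u(y), y, t)$, while
the $f_k\{t, \bar{x}\}$ are the Fourier coefficients of $F(\bar{u}'(y), \bar{v}(y), \bar{u}(y), y, t)$, and hence
the $[f_k\{t, x\} - f_k\{t, \bar{x}\}]$ are the Fourier coefficients of
$[F(u'(y), v(y), u(y), y, t) - F(\bar{u}'(y), \bar{v}(y), \bar{u}(y), y, t)]$.

Hence

\[ |f\{t, x\} - f\{t, \bar{x}\}|_{1,\infty}^2 = \frac{2}{\pi} \int_0^{\pi} [F(u'(y), v(y), u(y), y, t) - F(\bar{u}'(y), \bar{v}(y), \bar{u}(y), y, t)]^2 \, dy \]

\[ \le \frac{6\alpha^2}{\pi} \int_0^{\pi} [u'(y) - \bar{u}'(y)]^2 \, dy + \frac{6\beta^2}{\pi} \int_0^{\pi} [v(y) - \bar{v}(y)]^2 \, dy + \frac{6\gamma^2}{\pi} \int_0^{\pi} [u(y) - \bar{u}(y)]^2 \, dy \]

\[ = 3\alpha^2 \sum_{k=1}^{\infty} k^2 (x_k - \bar{x}_k)^2 + 3\beta^2 \sum_{k=1}^{\infty} (x_k' - \bar{x}_k')^2 + 3\gamma^2 \sum_{k=1}^{\infty} (x_k - \bar{x}_k)^2 \]

\[ \le (3\alpha^2 + 3\gamma^2) \sum_{k=1}^{\infty} k^2 (x_k - \bar{x}_k)^2 + 3\beta^2 \sum_{k=1}^{\infty} (x_k' - \bar{x}_k')^2 \le A^2 \|x - \bar{x}\|_{1,\infty}^2, \]

where $A^2$ is the greater of the two numbers $(3\alpha^2 + 3\gamma^2)$ and $3\beta^2$. Thus we have

\[ (4.5) \qquad |f\{t, x\} - f\{t, \bar{x}\}|_{1,\infty} \le A \cdot \|x - \bar{x}\|_{1,\infty}. \]

Also $|f\{t, x\}|_{1,\infty} \le A \cdot \|x\|_{1,\infty} + |f\{t, 0\}|_{1,\infty}$, which is obviously bounded for (x) in Q.
Finally we write

\[ (4.6) \qquad \psi_k^{(n)}\{t, x\} = f_k(t, x_1, x_1', \dots, x_n, x_n', 0, 0, 0, 0, \dots), \]

from which it follows, using (4.5), that

\[ \lim_{n \to \infty} \psi_k^{(n)}\{t, x\} = f_k\{t, x\}. \]

Thus we easily see that the $f_k\{t, x\}$ as here defined satisfy Hypotheses
3.I, 3.II, 3.III. Hence all the results of Part I are now available.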

It will be necessary to generalize our idea of a “solution” of a partial 
differential equation. We first make the following 

DEFINITION. A continuous function, u(y, t), defined on the rectangular
region $0 \le y \le \pi$, $0 \le t \le K$, and possessing first partial derivatives almost everywhere
in this rectangle, is a solution in the generalized sense of the second-order
partial differential equation P(u) = 0, if there exists a sequence of functions,
$u_1(y, t), u_2(y, t), u_3(y, t), \dots$, each of class C'' in this same rectangle, such that
the following four conditions hold:

\[ \text{(I)} \qquad \lim_{n \to \infty} u_n(y, t) = u(y, t) \ \text{uniformly in y and t;} \]

\[ \text{(II)} \qquad \lim_{n \to \infty} \int_0^{\pi} \Big( \frac{\partial u}{\partial y} - \frac{\partial u_n}{\partial y} \Big)^2 dy = 0 \ \text{uniformly in t;} \]

\[ \text{(III)} \qquad \lim_{n \to \infty} \int_0^{\pi} \Big( \frac{\partial u}{\partial t} - \frac{\partial u_n}{\partial t} \Big)^2 dy = 0 \ \text{uniformly in t;} \]

\[ \text{(IV)} \qquad \lim_{n \to \infty} \int_0^K \int_0^{\pi} [P(u_n(y, t))]^2 \, dy \, dt = 0. \]

In the next article there will be given a more general definition, which, 
however, for our present purposes is equivalent to the definition above. 

Let (x(t)) be a solution, valid for $0 \le t \le K$, of the infinite system of
equations (1.1). We know from §2 that such a solution exists and is in fact
unique. Then the function

\[ (4.7) \qquad u(y, t) = \sum_{k=1}^{\infty} x_k(t) \sin ky \]

is a solution (in the generalized sense) of (4.1) and satisfies the conditions (4.3).
The approximation functions, $u_n(y, t)$, mentioned in the definition will be
provided for as follows:

\[ (4.8) \qquad u_n(y, t) = \sum_{k=1}^{n} x_{nk}(t) \sin ky, \]

where $x_{n1}(t), \dots, x_{nn}(t)$ satisfy the finite system (3.1). Since there is no
difficulty about differentiating the finite sum in (4.8), we see that $u_n(y, t)$
satisfies the requirement of being of class C''.
It is intuitively evident that u(y, t) possesses almost everywhere the
partial derivatives

\[ \frac{\partial u}{\partial t} = \sum_{k=1}^{\infty} x_k'(t) \sin ky \qquad \text{and} \qquad \frac{\partial u}{\partial y} = \sum_{k=1}^{\infty} k x_k(t) \cos ky, \]

where the indicated series converge in the mean on the interval $0 \le y \le \pi$ for
each t. They do not necessarily converge in the usual sense. The rigorous
proof of these facts is omitted because it merely involves some of the fundamental
classical analysis concerning double limits and convergence in the
mean.

We notice also that u(y, t) and $\partial u/\partial t$ take on the preassigned initial values
of (4.3). For $x_k(0) = a_k$ and $x_k'(0) = a_k'$.

It remains to show that u(y, t) and the $u_n(y, t)$ satisfy the Conditions I,
II, III, IV of the definition.

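Proof that Condition I holds. Let ε be an arbitrarily small positive number.
We have

\[ |u(y, t) - u_n(y, t)| = \Big| \sum_{k=1}^{n} [x_k(t) - x_{nk}(t)] \sin ky + \sum_{k=n+1}^{\infty} x_k(t) \sin ky \Big| \le \sum_{k=1}^{n} |x_k(t) - x_{nk}(t)| + \sum_{k=n+1}^{\infty} |x_k(t)|. \]

Using Schwarz’s inequality we find that

\[ \sum_{k=1}^{n} |x_k(t) - x_{nk}(t)| \le \Big[ \sum_{k=1}^{n} \frac{1}{k^2} \Big]^{1/2} \Big[ \sum_{k=1}^{n} k^2 (x_k(t) - x_{nk}(t))^2 \Big]^{1/2} \le \frac{\pi}{6^{1/2}} \|x(t) - x_n(t)\|_{1,n}, \]

and this may be taken less than ½ε, according to the theorem of §3, if n is
chosen greater than some number $N_1$, independent of t. Likewise from
Schwarz’s inequality and the theorem of §3 we have

\[ \sum_{k=n+1}^{\infty} |x_k(t)| \le \Big[ \sum_{k=n+1}^{\infty} \frac{1}{k^2} \Big]^{1/2} \|x(t)\|_{n+1,\infty} < \tfrac{1}{2} \varepsilon \]

if n is greater than some number $N_2$, independent of t. Hence $|u(y, t)
- u_n(y, t)| < \varepsilon$ for $n > N_3$, where $N_3$ is the greater of the two numbers $N_1$
and $N_2$.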

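Proof that Condition II holds. The function $(\partial u/\partial y - \partial u_n/\partial y)$ has the
Fourier coefficients $(x_k(t) - x_{nk}(t))k$, for k = 1, 2, ..., n, and $x_k(t)k$, for
k = n+1, n+2, n+3, .... Consequently

\[ \frac{2}{\pi} \int_0^{\pi} \Big( \frac{\partial u}{\partial y} - \frac{\partial u_n}{\partial y} \Big)^2 dy = \sum_{k=1}^{n} k^2 (x_k(t) - x_{nk}(t))^2 + \sum_{k=n+1}^{\infty} k^2 x_k(t)^2 \le \|x(t) - x_n(t)\|_{1,n}^2 + \|x(t)\|_{n+1,\infty}^2, \]

and this by §3 converges uniformly to zero.

Proof that Condition III holds. The function $(\partial u/\partial t - \partial u_n/\partial t)$ has the
Fourier coefficients $(x_k'(t) - x_{nk}'(t))$, for k = 1, ..., n, and $x_k'(t)$, for k = n+1,
n+2, .... Consequently

\[ \frac{2}{\pi} \int_0^{\pi} \Big( \frac{\partial u}{\partial t} - \frac{\partial u_n}{\partial t} \Big)^2 dy = \sum_{k=1}^{n} (x_k'(t) - x_{nk}'(t))^2 + \sum_{k=n+1}^{\infty} x_k'(t)^2 \le \|x(t) - x_n(t)\|_{1,n}^2 + \|x(t)\|_{n+1,\infty}^2, \]

which converges uniformly to zero as above.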

Proof that Condition IV holds. Differentiate u_n(y, t) and substitute in the
operator, P( ). P[u_n(y, t)] thus is a function of y and t. It is found from
(4.1), (4.4), and (4.6) that

\frac{2}{\pi} \int_0^{\pi} P[u_n(y, t)] \sin ky \, dy = x_{nk}'' + k^2 x_{nk} - f_k\{t, x_n(t)\},    for k = 1, 2, ..., n,

and this vanishes on account of (3.1). Also

\frac{2}{\pi} \int_0^{\pi} P[u_n(y, t)] \sin ky \, dy = - f_k\{t, x_n(t)\},    for k > n.

Hence, by Parseval's theorem,

\frac{2}{\pi} \int_0^{\pi} \{ P[u_n(y, t)] \}^2 dy = \sum_{k=n+1}^{\infty} [ f_k\{t, x_n(t)\} ]^2,

and integrating this with respect to t between the limits 0 and K we get a
quantity which, by the theorem of §3, may be taken arbitrarily small by
taking n sufficiently large.

5. A study of the conception of a "solution in the generalized sense"
with special reference to partial differential equations of the form (4.1).
It will be shown in this section that a solution, u(y, t), of (4.1) in the
generalized sense is also a solution in the ordinary sense, provided that u
possesses second derivatives and F satisfies certain simple requirements as to
continuity and differentiability. It will also be shown that the solution in the
generalized sense obtained in the previous section is the only such solution
which satisfies the boundary conditions (4.3).

It is easy to prove these theorems for equation (4.1) because of its
especially simple structure. But the generalized notion of a solution of a partial
differential equation is naturally of a much broader character. It might very
well prove useful in the treatment of all partial differential equations,
especially those of the hyperbolic type. I give here a complete definition,
slightly more general than the one introduced in §4, which was not
symmetrical in y and t.

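DEFINITION 1. A continuous function, u(y, t), defined in some finite region
E and possessing almost everywhere in E first partial derivatives, is a solution
in the generalized sense of the second-order partial differential equation P(u) = 0
if there exists a sequence of functions, u_1(y, t), u_2(y, t), u_3(y, t), ..., each
defined in E and each of class C'', such that the following four conditions hold:

(I)    \lim_{n=\infty} u_n(y, t) = u(y, t) uniformly in E;

(II)   \lim_{n=\infty} \iint_E \Bigl| \frac{\partial u}{\partial y} - \frac{\partial u_n}{\partial y} \Bigr| \, dy \, dt = 0;

(III)  \lim_{n=\infty} \iint_E \Bigl| \frac{\partial u}{\partial t} - \frac{\partial u_n}{\partial t} \Bigr| \, dy \, dt = 0;

(IV)   \lim_{n=\infty} \iint_E | P(u_n) | \, dy \, dt = 0.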

In the sequel, E will be assumed to be a closed simply connected region,
whose boundary consists (say) of a finite number of arcs of analytic curves.



In comparing this definition with the one given in §4, it will be observed
that the exponent 2 has been omitted from the integrands in II, III, IV.
However, Schwarz's inequality shows us that, if

\lim_{n=\infty} \iint_E [g_n(y, t)]^2 \, dy \, dt = 0,

then also

\lim_{n=\infty} \iint_E | g_n(y, t) | \, dy \, dt = 0,

provided that E is finite. In other words, a function u(y, t) which satisfies the
conditions of the former definition will surely satisfy the conditions of this
last definition.
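The Schwarz step, that the integral of |g| over a finite region E is at most (area of E)^{1/2} times the square root of the integral of g², can be checked numerically. The following is a minimal sketch in Python; the rectangle, the test sequence g_n = sin(ny) e^{-t} / n, and the helper name are assumptions chosen for illustration and do not come from the paper.

```python
import math

def l1_and_l2_on_rectangle(g, y0, y1, t0, t1, n=100):
    """Midpoint-rule approximations of the integrals of |g| and of g^2 over a rectangle."""
    hy = (y1 - y0) / n
    ht = (t1 - t0) / n
    l1 = 0.0
    l2 = 0.0
    for i in range(n):
        y = y0 + (i + 0.5) * hy
        for j in range(n):
            t = t0 + (j + 0.5) * ht
            v = g(y, t)
            l1 += abs(v) * hy * ht
            l2 += v * v * hy * ht
    return l1, l2

# Hypothetical region E = [0, pi] x [0, 1] and a sequence whose mean square tends to zero.
area = math.pi * 1.0
rows = []
for n_index in (1, 4, 16):
    g = lambda y, t, m=n_index: math.sin(m * y) * math.exp(-t) / m
    l1, l2 = l1_and_l2_on_rectangle(g, 0.0, math.pi, 0.0, 1.0)
    rows.append((n_index, l1, math.sqrt(area * l2)))  # l1 never exceeds the second entry
```

Since the bound carries the factor (area of E)^{1/2}, the argument needs E to be finite, which is the proviso stated in the text.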

From the fundamental facts about convergence on the average, the reader
will readily verify the truth of the following

LEMMA 1. Let (y', t') be an interior point of E and δ a sufficiently small
preassigned positive number. Then it is possible to find a subsequence, u_n*(y, t),
of the sequence u_n(y, t), such that for almost all choices of t_0 in the interval
| t' − t_0 | ≤ δ we shall have

\lim_{n=\infty} \int_{E_{t_0}} \Bigl| \frac{\partial u}{\partial t} - \frac{\partial u_n^*}{\partial t} \Bigr| \, dy = 0,

where E_{t_0} denotes the cross-sectional point set obtained from E by putting t = t_0,
and where u(y, t) and u_n(y, t) satisfy the requirements of Definition 1.



DEFINITION 2. Let u(y, t), defined on E, be a solution of P(u) = 0 in the
generalized sense, and let u_n(y, t) be the approximation functions introduced in
Definition 1. Then the cross-sectional point set E_{t_0} obtained from E by setting
t = t_0 is called a proper line, if there exists a subsequence u_n*(y, t) such that

\lim_{n=\infty} \int_{E_{t_0}} \Bigl| \frac{\partial u}{\partial t} - \frac{\partial u_n^*}{\partial t} \Bigr| \, dy = 0.

If such a subsequence does not exist, E_{t_0} is called an improper line.
According to Lemma 1, almost all cross-sections are proper lines.


DEFINITION 3. A solution, u(y, t), of P(u) = 0 in the generalized sense will
be said to assume the initial values f(y) and g(y) for t = t_0, if both the following
conditions hold:

(I) E_{t_0} is a proper line;

(II) u(y, t_0) = f(y), and

\frac{\partial u}{\partial t}(y, t_0) = g(y)

almost everywhere on E_{t_0}.


The subsequence u_n*(y, t) mentioned in Definition 2 will then be such
that

\lim_{n=\infty} \int_{E_{t_0}} \Bigl| \frac{\partial u_n^*}{\partial t} - g(y) \Bigr| \, dy = 0.

We temporarily consider the simple equation

(5.1)    \frac{\partial^2 u}{\partial t^2} - \frac{\partial^2 u}{\partial y^2} = \phi(y, t).

LEMMA 2. If u(y, t) is a given function satisfying (5.1) in the ordinary sense,
then

(5.2)    u(y, t) = \tfrac{1}{2} u(y + t - t_0, t_0) + \tfrac{1}{2} u(y - t + t_0, t_0)
             + \frac{1}{2} \int_{y-t+t_0}^{y+t-t_0} \frac{\partial u}{\partial t}(\eta, t_0) \, d\eta + \frac{1}{2} \iint_{tri(y,t,t_0)} \phi(\eta, \tau) \, d\eta \, d\tau.

t_0 is arbitrary except as restricted below. The region of integration for the double
integral on the right, denoted by tri(y, t, t_0), is the triangle in the (η, τ) plane
with vertices at the following points: (y, t), (y − t + t_0, t_0), and (y + t − t_0, t_0).


We also require that tri(y, t, t_0) shall lie entirely within the region of
definition of φ and u, and that φ shall be integrable, so that the right hand
side of (5.2) will have a meaning.

This well known lemma can be easily proved by applying the following
formula, deduced from Green's theorem, to tri(y, t, t_0):


where S represents any closed region in the (η, τ) plane and C represents the
boundary of S taken in the proper sense. For later convenience I have
written η and τ as the variables of integration instead of y and t. That is, in
the above integrals, I regard u(y, t) and its partial derivatives as being
evaluated for y = η and t = τ. In the sequel, the notation will frequently be
changed in this way, whenever no confusion is likely to result, with no
further comment. The actual proof of the lemma is omitted.
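Lemma 2 is the classical d'Alembert representation for the equation ∂²u/∂t² − ∂²u/∂y² = φ, and it can be checked numerically on an explicit solution. The sketch below, in Python, evaluates the right hand side of (5.2) by the midpoint rule; the test function u = t² sin y (for which φ = (2 + t²) sin y), the sample point, and the helper name are assumptions made only for this illustration, and t > t_0 is assumed.

```python
import math

def dalembert_rhs(u, u_t, phi, y, t, t0, n=200):
    """Right-hand side of (5.2); both integrals use the midpoint rule with n cells."""
    left, right = y - t + t0, y + t - t0
    h = (right - left) / n
    # Line integral of du/dt along the base of the triangle, tau = t0.
    line_integral = sum(u_t(left + (i + 0.5) * h, t0) for i in range(n)) * h
    # Double integral of phi over tri(y, t, t0), sliced horizontally in tau.
    dtau = (t - t0) / n
    double_integral = 0.0
    for j in range(n):
        tau = t0 + (j + 0.5) * dtau
        lo, hi = y - (t - tau), y + (t - tau)
        deta = (hi - lo) / n
        double_integral += sum(phi(lo + (i + 0.5) * deta, tau) for i in range(n)) * deta * dtau
    return 0.5 * u(right, t0) + 0.5 * u(left, t0) + 0.5 * line_integral + 0.5 * double_integral

# Hypothetical test solution: u = t^2 sin y, so u_tt - u_yy = (2 + t^2) sin y.
u = lambda y, t: t * t * math.sin(y)
u_t = lambda y, t: 2.0 * t * math.sin(y)
phi = lambda y, t: (2.0 + t * t) * math.sin(y)
approx = dalembert_rhs(u, u_t, phi, 0.7, 1.0, 0.2)
exact = u(0.7, 1.0)
```

The two values agree up to the quadrature error, which shrinks as n grows.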

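From Lemma 2 we see that the non-linear partial differential equation
(4.1) is closely related to the following non-linear integro-partial differential
equation:

(5.3)    u(y, t) = \tfrac{1}{2} u(y + t - t_0, t_0) + \tfrac{1}{2} u(y - t + t_0, t_0) + \frac{1}{2} \int_{y-t+t_0}^{y+t-t_0} \frac{\partial u}{\partial t}(\eta, t_0) \, d\eta
             + \frac{1}{2} \iint_{tri(y,t,t_0)} F\Bigl[ \frac{\partial u}{\partial y}(\eta, \tau), \frac{\partial u}{\partial t}(\eta, \tau), u(\eta, \tau), \eta, \tau \Bigr] d\eta \, d\tau,

which we now proceed to consider.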




DEFINITION 4. A function u(y, t), defined in E and admitting almost
everywhere first partial derivatives, is a solution of (5.3) at the point (y', t') in E,
if, for almost all values of t_0 (termed "proper" values) in the neighborhood of t',
(5.3) is an identity in y and t in the neighborhood of (y', t'). It is a solution
throughout E, if it is a solution at every interior point of E.


Equations (4.1) and (5.3) are only partially equivalent because a solution 
of (4.1) in the ordinary sense must possess second derivatives, whereas a 
solution of (5.3) need not possess second derivatives. We shall see, however, 
that (4.1) and (5.3) are completely equivalent, if by a solution of (4.1) we 
mean a solution in the generalized sense, provided that F shall satisfy certain 
simple conditions. 


THEOREM 1. Let F(p_1, p_2, u, y, t) be defined for all p_1 and p_2, for |u| ≤ h,
and for (y, t) in E. Let F also obey the Lipschitz condition

| F(p_1, p_2, u, y, t) - F(\bar p_1, \bar p_2, \bar u, y, t) | \le \alpha | p_1 - \bar p_1 | + \beta | p_2 - \bar p_2 | + \gamma | u - \bar u |

for |u| ≤ h and |ū| ≤ h. Then a solution u(y, t) of (4.1) in the generalized sense,
such that | u(y, t) | < h, is also a solution of (5.3).


There exists a sequence of functions u_n(y, t) satisfying the conditions
of Definition 1. Because of Condition I we may assume without loss of
generality that | u_n(y, t) | < h, n = 1, 2, 3, ....

Let (y', t') be any interior point of E. It is possible to choose a positive
number δ and a neighborhood U of (y', t') such that, if (y, t) is a point of
U and t_0 satisfies the inequality | t' − t_0 | < δ, the triangular region tri(y, t, t_0)
will lie completely imbedded in E.

In accordance with Lemma 1 we have for almost all choices of t_0 the
following relation:

(5.4)    \lim_{n=\infty} \int_{E_{t_0}} \Bigl| \frac{\partial u}{\partial t} - \frac{\partial u_n^*}{\partial t} \Bigr| \, dy = 0,

where u_n*(y, t) is a certain subsequence of the given sequence, u_n(y, t). In
other words, t = t_0 is a proper line. Choose any such proper line such that
| t' − t_0 | < δ. Then hold t_0 fast.

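If we set

\phi_n(y, t) = \frac{\partial^2 u_n}{\partial t^2} - \frac{\partial^2 u_n}{\partial y^2} - F\Bigl( \frac{\partial u_n}{\partial y}, \frac{\partial u_n}{\partial t}, u_n, y, t \Bigr),

we know from Definition 1, Condition IV, that

(5.5)    \lim_{n=\infty} \iint_E | \phi_n(\eta, \tau) | \, d\eta \, d\tau = 0.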


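Also from Lemma 2 we have the following identity in y and t in the
neighborhood U of (y', t'):

(5.6)    u_n(y, t) = \tfrac{1}{2} u_n(y + t - t_0, t_0) + \tfrac{1}{2} u_n(y - t + t_0, t_0)
             + \frac{1}{2} \int_{y-t+t_0}^{y+t-t_0} \frac{\partial u_n}{\partial t}(\eta, t_0) \, d\eta
             + \frac{1}{2} \iint_{tri(y,t,t_0)} F\Bigl[ \frac{\partial u_n}{\partial y}, \frac{\partial u_n}{\partial t}, u_n(\eta, \tau), \eta, \tau \Bigr] d\eta \, d\tau
             + \frac{1}{2} \iint_{tri(y,t,t_0)} \phi_n(\eta, \tau) \, d\eta \, d\tau.

Now consider the function

(5.7)    w(y, t) = \tfrac{1}{2} u(y + t - t_0, t_0) + \tfrac{1}{2} u(y - t + t_0, t_0)
             + \frac{1}{2} \int_{y-t+t_0}^{y+t-t_0} \frac{\partial u}{\partial t}(\eta, t_0) \, d\eta
             + \frac{1}{2} \iint_{tri(y,t,t_0)} F\Bigl[ \frac{\partial u}{\partial y}, \frac{\partial u}{\partial t}, u(\eta, \tau), \eta, \tau \Bigr] d\eta \, d\tau.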

The fact that the double integral on the right actually exists follows from 
the fact that 


OUn 
Un, 1, T + lends 


is assumed to exist, since u was by hypothesis a solution in the generalized
sense. It is easily proved in virtue of the Lipschitz condition and Conditions
I, II, and III of Definition 1 that F(∂u_n/∂y, ∂u_n/∂t, u_n(y, t), y, t) converges
in the mean, as n tends to infinity, to F(∂u/∂y, ∂u/∂t, u(y, t), y, t), and hence
this latter function is integrable.

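We shall show that w(y, t) = u(y, t) in the neighborhood U of (y', t').
Subtracting (5.6) from (5.7) we get

w(y, t) - u_n^*(y, t) = \tfrac{1}{2} \{ u(y + t - t_0, t_0) - u_n^*(y + t - t_0, t_0) \} + \tfrac{1}{2} \{ u(y - t + t_0, t_0) - u_n^*(y - t + t_0, t_0) \}
    + \frac{1}{2} \int_{y-t+t_0}^{y+t-t_0} \Bigl\{ \frac{\partial u}{\partial t}(\eta, t_0) - \frac{\partial u_n^*}{\partial t}(\eta, t_0) \Bigr\} d\eta
    + \frac{1}{2} \iint_{tri(y,t,t_0)} \Bigl\{ F\Bigl[ \frac{\partial u}{\partial y}, \frac{\partial u}{\partial t}, u, \eta, \tau \Bigr] - F\Bigl[ \frac{\partial u_n^*}{\partial y}, \frac{\partial u_n^*}{\partial t}, u_n^*, \eta, \tau \Bigr] \Bigr\} d\eta \, d\tau
    - \frac{1}{2} \iint_{tri(y,t,t_0)} \phi_n^*(\eta, \tau) \, d\eta \, d\tau.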




Let ε be an arbitrarily small positive number. Then the sum of the absolute
values of the first two terms on the right can be taken less than ¼ε by taking
n sufficiently large (independently of y or t) because of Condition I of
Definition 1. The absolute value of the third term can be taken less than ¼ε
because of (5.4). The absolute value of the fifth (last) term can also be taken
less than ¼ε by (5.5). And finally applying the Lipschitz condition to the
fourth term we have

| w(y, t) - u_n^*(y, t) | < \tfrac{3}{4} \varepsilon + \frac{\alpha}{2} \iint_{tri(y,t,t_0)} \Bigl| \frac{\partial u}{\partial y} - \frac{\partial u_n^*}{\partial y} \Bigr| d\eta \, d\tau
    + \frac{\beta}{2} \iint_{tri(y,t,t_0)} \Bigl| \frac{\partial u}{\partial t} - \frac{\partial u_n^*}{\partial t} \Bigr| d\eta \, d\tau
    + \frac{\gamma}{2} \iint_{tri(y,t,t_0)} | u - u_n^* | \, d\eta \, d\tau.

Hence from Conditions I, II, III, we have | w(y, t) − u_n*(y, t) | < ε, if n > N,
where N is a number depending only upon ε. In other words u_n*(y, t) tends
uniformly to w(y, t) in U. Since, however, u_n* is a subsequence of u_n, which
by hypothesis converges uniformly to u(y, t), it follows that w(y, t) = u(y, t)
in the neighborhood of (y', t').

Hence u is a solution of (5.3) at the point (y', t'). Since (y', t') was any
interior point of E, u is by Definition 4 a solution throughout E of (5.3).

THEOREM 2. Let y', t', t_1 be any three real numbers determining a closed
triangular region, tri(y', t', t_1), and such that t' − t_1 = T > 0. Let F(p_1, p_2, u, y, t)
be defined for all values of y and t which are coordinates of points in tri(y', t', t_1),
for |u| ≤ h, and for all values of p_1 and p_2 whatever. And suppose that it satisfies
the Lipschitz condition

| F(p_1, p_2, u, y, t) - F(\bar p_1, \bar p_2, \bar u, y, t) | \le \alpha | p_1 - \bar p_1 | + \beta | p_2 - \bar p_2 | + \gamma | u - \bar u |

for |u| ≤ h, |ū| ≤ h.

Let f(y) and g(y) be defined for y' − T ≤ y ≤ y' + T. Let f(y) be an indefinite
integral of a function f'(y), and let g(y) be summable.

Then there can not be more than one solution, u(y, t), defined on tri(y', t', t_1),
of the equation

(5.8)    u(y, t) = \tfrac{1}{2} f(y + t - t_1) + \tfrac{1}{2} f(y - t + t_1)
             + \frac{1}{2} \int_{y-t+t_1}^{y+t-t_1} g(\eta) \, d\eta + \frac{1}{2} \iint_{tri(y,t,t_1)} F\Bigl[ \frac{\partial u}{\partial y}, \frac{\partial u}{\partial t}, u(\eta, \tau), \eta, \tau \Bigr] d\eta \, d\tau,

for which |u| < h.



Let u(y, t) and v(y, t) be two such solutions of (5.8).
Differentiating, using the Lipschitz condition, and executing other
obvious operations, we establish


2 |u(y, 4) — o(y, | 

y’—t’+t Ot ot ty y’—t’+r oy oy Ot ot 


av 
f — (n, 1) — +: 
LOY oy 


The last two of the above three inequalities hold for almost all values of t
on t_1 ≤ t ≤ t'. Let t_2 be one of these non-exceptional values, not greater than
th+3/(a+8+Ty). Let M be the least common upper bound of the left 
members of the above inequalities for the non-exceptional values of t on the
interval t_1 ≤ t ≤ t_2, and for y restricted so that the point (y, t) lies in tri(y', t', t_1).
Hence, for some such point (y*, t*), one at least of these left members is not
less than 2M, and we thus obtain 


$M < M(a + 8)(te — th) + 2T(t2 — th) y¥M/2 
= (4 — h)(a+B+Ty)M S 3M. 


Therefore M =0. 
It follows that u(y, t) = v(y, t) in the part of tri(y', t', t_1) which lies below
the line t = t_2. By a repetition of this argument the reader can readily extend
this result to include the whole of tri(y', t', t_1).

Evidently Theorems 1 and 2 can be used to prove the uniqueness of the
function u(y, t) of §4, which satisfies (in the generalized sense) the differential
equation (4.1) and obeys the conditions (4.3).

In order to apply Theorem 2, however, it is first necessary to extend the
definitions of F(p_1, p_2, u, y, t), f(y), g(y), and u(y, t), which in §4 were
regarded as defined only for 0 ≤ y ≤ π. It is necessary to do this so that every
point in the rectangle 0 ≤ y ≤ π, 0 ≤ t ≤ K may be imbedded within a
triangular region tri(y', t', 0). This extension can obviously be effected in a
variety of ways. For example, we could define F, f, g, u outside of 0 ≤ y ≤ π
by making them periodic in y with period π. This definition alone may give
conflicting values for F at points for which y = kπ, k = 0, ±1, ±2, ±3, ....
So we shall redefine F at these points by writing F(p_1, p_2, u, kπ, t) = 0. With
this extended definition, u(y, t) furnishes us with a generalized solution of
(4.1) which is valid for −Nπ ≤ y ≤ +Nπ, 0 ≤ t ≤ K, where N is an arbitrarily
large integer. In order to see this it is only necessary first to extend the
definitions of the approximation functions u_n(y, t) by making them periodic in
y and secondly to modify them slightly near the points for which y = kπ, so
that they may be of class C'' throughout in accordance with Definition 1.


Part III 


6. Statement of the problem for the parabolic partial differential
equation. We now treat certain partial differential equations of parabolic type
by the methods developed in Parts I and II. A different point of view is
assumed, however, in that no use is made of a solution in a generalized sense.
M. R. Siddiqi†, using the methods of Lichtenstein, has treated parabolic
equations of a more restricted type and for less general initial conditions.
The present methods are also simpler than Siddiqi's; but on the other hand
Siddiqi's solution is valid for 0 ≤ t < ∞, whereas the present solution is
defined only for a sufficiently small interval 0 ≤ t ≤ K > 0. The inequalities
upon which the present work is based seem to admit considerable latitude,
but I have been unable to obtain the extension to the infinite interval.

Since practically no repetition is involved, this part of the paper is
written so that it can be read independently of Parts I and II.

We shall consider a partial differential equation of the form,

(6.1)    \frac{\partial u}{\partial t} - \frac{\partial^2 u}{\partial y^2} = F\Bigl( \frac{\partial u}{\partial y}, u, y, t \Bigr),

where F(p, u, y, t) is defined for |p| ≤ P, |u| ≤ U, 0 ≤ y ≤ π, 0 ≤ t ≤ T. It is
continuous and possesses continuous partial derivatives with respect to
p, u, and y. Furthermore we assume either that

(6.2)    F(p, 0, y, t) = 0

or that

(6.2')   F(p, u, 0, t) = F(p, u, π, t) = 0.

As a consequence of the existence of the continuous partial derivatives
we may also write the following Lipschitz condition:

(6.3)    | F(p, u, y, t) - F(\bar p, \bar u, y, t) | \le \alpha \cdot | p - \bar p | + \beta \cdot | u - \bar u |,

which is valid for the domain of definition specified above.
We have given a function f(y), defined on 0 ≤ y ≤ π, vanishing at the end


† See bibliography given in the Introduction.
‡ In the equations treated by Siddiqi, F is developable in a power series in u and p:
O,1,-+-, 


F(p, u,y, t) = 


ap 
where, however, a@ is not allowed to take on the value 0. Consequently Siddiqi’s equations satisfy 
(6.2). 


points of this interval, possessing an absolutely continuous first derivative,
and having almost everywhere on 0 ≤ y ≤ π a second derivative whose square
is summable. This is equivalent to the assumption that f(y) can be developed
in a trigonometric series f(y) = Σ_{k=1}^{∞} a_k sin ky, for which Σ a_k² k⁴ converges.
We wish to find a function u(y, t), defined for 0 ≤ y ≤ π, 0 ≤ t ≤ K, which
satisfies (6.1) and the boundary conditions

(6.4)    u(y, 0) = f(y),    u(0, t) = u(π, t) = 0    (0 ≤ t ≤ K).


7. The related infinite systems of ordinary differential equations. We
shall here prove an existence lemma for infinite systems of differential
equations of the type

(7.1)    \frac{dx_k}{dt} + \mu_k^2 x_k = f_k\{t, x\}    (k = 1, 2, 3, ...)

under the initial conditions x_k(0) = a_k. Here, contrary to the notation of §1,
a letter followed by {t, x} denotes a function dependent upon the infinitely
many independent variables t, x_1, x_2, x_3, .... The μ_k are any infinite set of
real numbers, such that 1 ≤ μ_k ≤ μ_{k+1}. We shall consider (7.1) in the form of
the equivalent infinite system of integral equations:

(7.2)    x_k(t) = a_k e^{-\mu_k^2 t} + \int_0^t f_k\{\tau, x(\tau)\} \, e^{\mu_k^2 (\tau - t)} \, d\tau.


As in Part I, we shall use the following terminology and abbreviations:
[x] stands for the infinite system of numbers represented by the symbols
x_1, x_2, x_3, ...; |x| is an abbreviation for \lim_{n=\infty} ( \sum_{k=1}^{n} x_k^2 )^{1/2}, if this limit exists. It
will be noted that this symbol obeys the inequality
(7.3)    | x + y | ≤ | x | + | y |.
The ordered sequence [x] will frequently be regarded as a point in function
space. The region Q(q) will consist of those points [x] for which
(7.4)    | μ²x | ≤ q,
where q is positive. The region R(r) consists of those points [x] for which
(7.5)    | μ²(x − a e^{−μ²t}) | ≤ r

for at least one value of t ≥ 0 (r > 0).
We shall always assume that |μ²a| exists and we shall take q = r + |μ²a|.
Evidently, then, R < Q, as follows from (7.3).

t Siddiqi assumes for one of his theorems that )_ &*| ax| converges and for the other theorem that 
converges. 




We assume that f_k{t, x} is defined for 0 ≤ t ≤ T and for [x] in some region
Q(q). Furthermore we assume the following three fundamental hypotheses:


HYPOTHESIS 1. If [x(t)] is any sequence of continuous functions whose
range is in Q(q) for 0 ≤ t ≤ T, then f_k{t, x(t)} is integrable on 0 ≤ t ≤ T.

HYPOTHESIS 2. A number B exists such that | μ f{t, x} | ≤ 2^{1/2} B for [x] in
Q(q) and for 0 ≤ t ≤ T.

HYPOTHESIS 3. A number A exists such that | f{t, x} − f{t, x̄} | ≤ A·| μ(x − x̄) |
for 0 ≤ t ≤ T and for [x] and [x̄] in Q(q).

On the basis of these hypotheses we shall prove the existence of an infinite
sequence of continuous functions [x(t)] defined for 0 ≤ t ≤ K (K is the smaller
of the two numbers T and r²/B²) whose range is in R(r), such that when these
functions are substituted in (7.2) the right hand members exist and are identically
equal to the left. Such a set of functions is called a solution of (7.2). This
solution is unique.

From Schwarz's inequality and Lebesgue's theorem for integrating
infinite sequences, it is readily verified that, if F_k(t) is a sequence of integrable
functions such that Σ_{k=1}^{n} μ_k² [F_k(t)]² is uniformly bounded with respect to
n and t (for 0 ≤ t ≤ T), then


~Jo 0 


where p = 0 or 1 and 0 ≤ t ≤ T.
We set up the successive approximations 


2 . 2 
0 


Obviously [x^{(0)}(t)] is in R(r). Assume for the moment that [x^{(n−1)}(t)] is
in R(r) for 0 ≤ t ≤ K; then, if the x_k^{(n−1)}(t) are also continuous, we have


1 t 
f luf{r, } Par 
2 Jo 
Bts rfor0 Sis T, r*/B*. 


Hence we have shown by induction that [x^{(n)}(t)] lies in R(r) for 0 ≤ t ≤ K,
and that the x_k^{(n)}(t) are continuous.
Also 




< Ju? [x (2) Bt. 


Using (7.6) and Hypothesis 3, we get 


We thus have a recursion formula which enables us to prove by induction that 


A 2(n—1) B2jn 


n! 


By the Weierstrass test we see that the successive approximations converge
uniformly: lim_{n=∞} x_k^{(n)}(t) = x_k(t) uniformly, this relation being regarded as
defining the x_k(t) for 0 ≤ t ≤ K.
Using (7.3) we have
2m R2K m+ly 1/2 
m=n m=n (m + 1)! 


Then from Hypothesis 3, we have 


A2mB2K m+1 1/2 
— sft, @}] 2024 ( (m + 1)! ) 


And, since the right hand member of this inequality may be taken arbitrarily
small independently of t (0 ≤ t ≤ K), we have lim_{n=∞} f_k{t, x^{(n)}(t)} = f_k{t, x(t)}
uniformly. It is therefore obvious that [x(t)] constitutes a solution of (7.2).
It also satisfies (7.1) almost everywhere on 0 ≤ t ≤ K.

The proof of the uniqueness of this solution follows essentially the same 
lines and is left to the reader. 
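The method of successive approximations used in this article can be illustrated on a finite truncation of the system. The following Python sketch assumes the system has the form dx_k/dt + k² x_k = f_k{t, x} with μ_k = k (as in the application of §8), keeps only n unknowns, and replaces the time integral by the trapezoid rule; the function names, the grid, and the sample forcing are hypothetical choices for illustration and are not part of the proof.

```python
import math

def picard_truncated(a, f, t_max, steps=100, iterations=8):
    """Successive approximations for x_k' + k^2 x_k = f_k(t, x), k = 1..n, x_k(0) = a[k-1].

    f(t, x) must return a list of n forcing values; the time integral is
    approximated by the trapezoid rule on a uniform grid (an assumption of
    this sketch, not part of the paper's argument).
    """
    n = len(a)
    ts = [t_max * i / steps for i in range(steps + 1)]
    mu2 = [(k + 1) ** 2 for k in range(n)]  # mu_k = k
    # Zeroth approximation: x_k(t) = a_k * exp(-mu_k^2 t).
    x = [[a[k] * math.exp(-mu2[k] * t) for k in range(n)] for t in ts]
    for _ in range(iterations):
        fvals = [f(t, x[i]) for i, t in enumerate(ts)]
        new_x = []
        for i, t in enumerate(ts):
            row = []
            for k in range(n):
                # Trapezoid rule for the integral of f_k * exp(mu_k^2 (tau - t)) on [0, t].
                integral = 0.0
                for j in range(i):
                    g0 = fvals[j][k] * math.exp(mu2[k] * (ts[j] - t))
                    g1 = fvals[j + 1][k] * math.exp(mu2[k] * (ts[j + 1] - t))
                    integral += 0.5 * (g0 + g1) * (ts[j + 1] - ts[j])
                row.append(a[k] * math.exp(-mu2[k] * t) + integral)
            new_x.append(row)
        x = new_x
    return ts, x

def linear_forcing(t, x):
    """Hypothetical forcing f_k = x_k / 2, for which the exact solution is a_k exp(-(k^2 - 1/2) t)."""
    return [0.5 * xi for xi in x]

ts, xs = picard_truncated([1.0, 0.5, 0.25], linear_forcing, t_max=0.5)
```

For this linear forcing the iterates can be compared with the closed-form solution, which gives a quick check that the scheme behaves as the recursion in the text suggests.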

8. Application to the partial differential equation. In applying the
results of the preceding article we shall take μ_k = k; q = the lesser of the two
numbers 6^{1/2}U/π and 6^{1/2}P/π. And we shall assume (as in §7) that
q − |μ²a| = r > 0. We define the f_k{t, x} as follows.

Let [x] be an arbitrary point of Q(q). |μ²x| ≤ q. Then from Schwarz's
inequality we have

\sum_{k=1}^{\infty} | x_k | \le \sum_{k=1}^{\infty} k | x_k | \le \Bigl( \sum_{k=1}^{\infty} k^4 x_k^2 \Bigr)^{1/2} \Bigl( \sum_{k=1}^{\infty} \frac{1}{k^2} \Bigr)^{1/2} \le \frac{q \pi}{6^{1/2}} \le U \text{ and } P.

It follows that the two series u'(y) = Σ k x_k cos ky and u(y) = Σ x_k sin
ky converge uniformly. Moreover | u'(y) | ≤ P and | u(y) | ≤ U. By the
Riesz-Fischer theorem we also have a function u''(y), defined almost everywhere
on 0 ≤ y ≤ π, such that its square is summable on this interval and such that

(8.1)    \frac{2}{\pi} \int_0^{\pi} u''(y) \sin ky \, dy = - k^2 x_k

and

(8.2)    \frac{2}{\pi} \int_0^{\pi} [u''(y)]^2 \, dy = | \mu^2 x |^2.

Integrating (8.1) by parts, it is easily seen that u''(y) is really the second
derivative of u(y), a fact anticipated in the notation. Thus corresponding to
any point [x] in Q(q) we can write down a function u(y) with the properties
enunciated above. We are now in a position to use this function to define

(8.3)    f_k\{t, x\} = \frac{2}{\pi} \int_0^{\pi} F[u'(y), u(y), y, t] \sin ky \, dy.


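Integrating this by parts we have

f_k\{t, x\} = - \frac{2}{\pi} \Bigl[ F[u'(y), u(y), y, t] \frac{\cos ky}{k} \Bigr]_0^{\pi} + \frac{2}{\pi} \int_0^{\pi} \frac{dF}{dy} \frac{\cos ky}{k} \, dy.

Hence from (6.2) or (6.2') we get

k f_k\{t, x\} = \frac{2}{\pi} \int_0^{\pi} \Bigl[ \frac{\partial F}{\partial p} u''(y) + \frac{\partial F}{\partial u} u'(y) + \frac{\partial F}{\partial y} \Bigr] \cos ky \, dy.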


Therefore 
2 aF aF 


Now the right hand side of this equality is bounded (with respect to [x])
because of two facts: I. The partial derivatives of F, being continuous in a
closed region, are bounded. II. By Schwarz's inequality and (8.2) we know
that any expression of the form ∫_0^π G(y) u''(y) dy is bounded whenever G(y)
is bounded. Hence Hypothesis 2 of the preceding article holds for the
definition of f_k{t, x} given in (8.3).

Now let [x] and [x̄] be two points of Q, to which correspond respectively
the two functions u(y) and ū(y). We have from (6.3)


2 
=} — | F[u'(y), u(y), », #] — Fla’(y), Pay 


2 
‘= f [2a2- | u’(y) — a’(y) | 2+ 262- | u(y) — a(y) | 
T 0 


= 2a? — ki,)? + 26? — 


kml k=l 
S 2(a? + — ap. 
Hence Hypothesis 3 holds with A = (a?+?)/”. 

The proof that Hypothesis 1 holds is easy and is left to the reader. 

Hence all the results of §7 are now available. Using the x_k(t), the existence
of which was asserted in that article, we form the function u(y, t) = Σ_{k=1}^{∞}
x_k(t) sin ky. Then u(y, t) satisfies (6.1) almost everywhere in the region 0 ≤ y ≤ π,
0 ≤ t ≤ K, and also fulfills the boundary conditions (6.4). Furthermore u(y, t)
is the only function of class C' in y and possessing derivatives ∂u/∂t, ∂²u/∂y²
almost everywhere, such that | u(y, t) | ≤ U, | ∂u/∂y | ≤ P, which enjoys these
properties.

In the first place by (7.1), (7.4), and Hypothesis 2, it is seen that Σ_{k=1}^{∞}
[dx_k/dt]² converges. By the Riesz-Fischer theorem we can therefore find a
function w(y, t) whose Fourier coefficients are precisely these dx_k/dt. It is
now an elementary exercise in analysis to identify w(y, t) with ∂u/∂t.

The theorem follows from the fact that the Fourier coefficients of

\frac{\partial u}{\partial t} - \frac{\partial^2 u}{\partial y^2} - F\Bigl( \frac{\partial u}{\partial y}, u, y, t \Bigr)

are none other than the

\frac{dx_k}{dt} + k^2 x_k - f_k\{t, x\},

all of which vanish. The uniqueness part of the theorem follows from the fact
that the assumption of the existence of a second such function, ū(y, t),
leads to a contradiction of the uniqueness of the functions [x(t)].


CAMBRIDGE, Mass. 


ON DEFINITIONS OF BOUNDED VARIATION FOR 
FUNCTIONS OF TWO VARIABLES* 


BY 
JAMES A. CLARKSON AND C. RAYMOND ADAMS 


1. Introduction. Several definitions have been given of conditions under 
which a function of two or more independent variables shall be said to be of 
bounded variation. Of these definitions six are usually associated with the 
names of Vitali, Hardy, Arzela, Pierpont, Fréchet, and Tonelli respectively. 
A seventh has been formulated by Hahn and attributed by him to Pierpont; 
it does not seem obvious to us that these two definitions are equivalent, and 
we shall give a proof of that fact. 

The relations between these several definitions have thus far been very 
incompletely determined, and there would appear to have been misconcep- 
tions concerning them. In the present paper we propose to investigate these 
relations rather fully, confining our attention to functions of two independent 
variables. 

We first (§2) give the seven definitions mentioned above and a list of the 
known relations among them. In §3 some properties of the classes of functions 
satisfying the several definitions are established. In § 4 we determine, for each 
pair of classes, whether one includes the other or they overlap. In §5 further 
relations are found concerning the extent of the common part of two or more 
classes. We next (§6) give a list of similar relations when only bounded func- 
tions are admitted to consideration; in §7 additional like relations are ob- 
tained when only continuous functions are admitted. We conclude (§8) 
with a list of the comparatively few relations that are not yet fully deter- 
mined. 

2. Definitions. The function f(x, y) is assumed to be defined in a rectangle
R (a ≤ x ≤ b, c ≤ y ≤ d). By the term net we shall, unless otherwise specified,
mean a set of parallels to the axes:

x = x_i (i = 0, 1, 2, ..., m),    a = x_0 < x_1 < ... < x_m = b;
y = y_j (j = 0, 1, 2, ..., n),    c = y_0 < y_1 < ... < y_n = d.

Each of the smaller rectangles into which R is divided by a net will be called
a cell. We employ the notation


* Presented to the International Congress of Mathematicians, Zurich, September 5, 1932, and 
to the American Mathematical Society, April 15, 1933; received by the editors December 22, 1932. 




\Delta_{11} f(x_i, y_j) = f(x_{i+1}, y_{j+1}) - f(x_{i+1}, y_j) - f(x_i, y_{j+1}) + f(x_i, y_j),
\Delta f(x_i, y_i) = f(x_{i+1}, y_{i+1}) - f(x_i, y_i).

The total variation function, φ(x̄) [ψ(ȳ)], is defined as the total variation of
f(x̄, y) [f(x, ȳ)] considered as a function of y [x] alone in the interval (c, d)
[(a, b)], or as +∞ if f(x̄, y) [f(x, ȳ)] is of unbounded variation.


DEFINITION V (Vitali-Lebesgue-Fréchet-de la Vallée Poussin*). The function
f(x, y) is said to be of bounded variation if the sum

\sum_{i=0, j=0}^{m-1, n-1} | \Delta_{11} f(x_i, y_j) |

is bounded for all nets.
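The sum in Definition V is easy to compute for a given net. The following Python sketch evaluates the second difference over each cell and adds the absolute values; the function names and the sample nets are illustrative assumptions, and a single net of course only gives a lower bound for the supremum over all nets.

```python
def delta11(f, xs, ys, i, j):
    """Second difference of f over the cell [xs[i], xs[i+1]] x [ys[j], ys[j+1]]."""
    return (f(xs[i + 1], ys[j + 1]) - f(xs[i + 1], ys[j])
            - f(xs[i], ys[j + 1]) + f(xs[i], ys[j]))

def vitali_sum(f, xs, ys):
    """Sum of |delta11| over all cells of the net with vertical lines xs and horizontal lines ys."""
    return sum(abs(delta11(f, xs, ys, i, j))
               for i in range(len(xs) - 1)
               for j in range(len(ys) - 1))

# Hypothetical net on the unit square.
xs = [i / 4 for i in range(5)]
ys = [j / 5 for j in range(6)]
v_product = vitali_sum(lambda x, y: x * y, xs, ys)
v_sum = vitali_sum(lambda x, y: x + y, xs, ys)
```

A function of the form g(x) + h(y) has vanishing second differences, so its sum is zero for every net regardless of how wild g and h are; this is one way to see why the additional conditions in Definition H are needed.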


DEFINITION F (Fréchet). The function f(x, y) is said to be of bounded
variation if the sum

\sum_{i=0, j=0}^{m-1, n-1} \epsilon_i \bar\epsilon_j \Delta_{11} f(x_i, y_j)

is bounded for all nets and for all possible choices of ε_i = ±1 and ε̄_j = ±1.


DEFINITION H (Hardy-Krause). The function f(x, y) is said to be of bounded
variation if it satisfies the condition of definition V and if in addition† f(x̄, y) is
of bounded variation in y (i.e., φ(x̄) is finite) for at least one x̄ and f(x, ȳ) is of
bounded variation in x (i.e., ψ(ȳ) is finite) for at least one ȳ.


DEFINITION A (Arzelà). Let (x_i, y_i) (i = 0, 1, 2, ..., m) be any set of
points satisfying the conditions

a = x_0 ≤ x_1 ≤ x_2 ≤ ... ≤ x_m = b;
c = y_0 ≤ y_1 ≤ y_2 ≤ ... ≤ y_m = d.

Then f(x, y) is said to be of bounded variation if the sum

\sum_{i=0}^{m-1} | \Delta f(x_i, y_i) |

is bounded for all such sets of points.
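The sum in Definition A runs along a chain of points that is nondecreasing in both coordinates. A minimal Python sketch follows; the function name, the monotonicity check, and the sample chain are assumptions added for illustration.

```python
def arzela_sum(f, points):
    """Sum of |f(x_{i+1}, y_{i+1}) - f(x_i, y_i)| along a chain of points.

    The chain must be nondecreasing in both coordinates, as Definition A requires.
    """
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        if x1 < x0 or y1 < y0:
            raise ValueError("points must be nondecreasing in x and in y")
    return sum(abs(f(x1, y1) - f(x0, y0))
               for (x0, y0), (x1, y1) in zip(points, points[1:]))

# Hypothetical chain from the corner (a, c) = (0, 0) to (b, d) = (1, 1).
chain = [(0.0, 0.0), (0.5, 0.25), (1.0, 1.0)]
a_sum = arzela_sum(lambda x, y: x + y, chain)
```

For a function that is monotone in each variable the sum telescopes to f(b, d) − f(a, c) for every chain, which is consistent with the remark in the text that class A consists of differences of bounded monotone functions.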


* References to most of the authors mentioned here in connection with the various definitions 
are given by Hahn, Theorie der Reellen Funktionen, Berlin, 1921, pp. 539-547, or by Hobson, Theory 
of Functions of a Real Variable, 3d edition, vol. 1, Cambridge, 1927, pp. 343-347. We need supple- 
ment these only by Tonelli, Sulla quadratura delle superficie, Accademia dei Lincei, Rendiconti, (6), 
vol. 3 (1926), pp. 357-362. 

† In the rectangle R is always to be understood.

‡ The definition H as originally formulated imposed the two latter conditions for every x and
every y, respectively, but it was shown by W. H. Young that the three conditions were redundant and 
that the definition could be reduced to the form given here. See Hobson, loc. cit., p. 345. 




DEFINITION P (Pierpont). Let any square net be employed, which covers the
whole plane and has its lines parallel to the respective axes. The side of each
square may be denoted by D, and no line of the net need coincide with a side of
the rectangle R. A finite number of the cells of the net will then contain points of
R, and we may denote by ω_ν the oscillation of f(x, y) in the νth of these cells,
regarded as a closed region. The function f(x, y) is said to be of bounded variation
if the sum

\sum_{\nu} D \, \omega_{\nu}

is bounded for all such nets in which D is less than some fixed constant.


DEFINITION P_H (Hahn's version of definition P). Let any net be employed
in which we have m = n and x_{i+1} − x_i = (b − a)/m, y_{i+1} − y_i = (d − c)/m (i = 0,
1, 2, ..., m − 1). Then there are m² congruent rectangular cells and we may
let ω_ν^{(m)} stand for the oscillation of f(x, y) in the νth cell, regarded as a closed region.
The function f(x, y) is said to be of bounded variation if the sum

\sum_{\nu=1}^{m^2} \frac{\omega_{\nu}^{(m)}}{m}

is bounded* for all m.


DEFINITION T (Tonelli). The function f(x, y) is said to be of bounded
variation if the total variation function φ(x̄) is finite almost everywhere in (a, b), and
its Lebesgue integral over (a, b) exists (finite), while a symmetric condition is
satisfied by ψ(ȳ).

It may be of interest to indicate briefly how a set of definitions seemingly
so diverse came to be formulated. Definition V, perhaps the most natural
analogue of that of bounded variation for a function of one variable, is
sufficient to insure the existence of the Riemann-Stieltjes double integral
∫_a^b ∫_c^d g(x, y) d_x d_y f(x, y) for every continuous function g(x, y). The existence of
this integral†, when g(x, y) is the product of a continuous function of x and a
continuous function of y, is also implied by condition F, which is weaker than
V (see relation (3) below). Definition H singles out a class of functions which
it is convenient to consider in the study of double Fourier series. Definition
A, also a rather natural analogue of that of bounded variation for a function
of one variable, expresses a condition necessary and sufficient that f(x, y)
be expressible as the difference of two bounded monotone functions.*
Definition P, or P_H, is a natural extension to functions of two variables of the
notion of bounded fluctuation, to use Hobson's terminology, which is
equivalent to that of bounded variation for functions of one variable. Condition T
is necessary and sufficient that the surface z = f(x, y), where f(x, y) is
continuous, be of finite area in the sense of Lebesgue; this definition also is useful
in connection with double Fourier series.

* It may readily be proved that this is equivalent to assuming that there exists some infinite
sequence of values of m, say m_k (k = 1, 2, 3, ...; m_k < m_{k+1}), with m_{k+1}/m_k bounded, for which this
sum is bounded.

† In a certain restricted sense; see Fréchet, Sur les fonctionnelles bilinéaires, these Transactions,
vol. 16 (1915), pp. 215-234, especially pp. 225-227. Several questions concerning these double
integrals are considered in a forthcoming paper by Clarkson.

For simplicity we shall also use the letters V, F, H, A, P, P_H, and T to
represent the classes of functions satisfying the respective definitions. The class
of bounded functions will be denoted by B and the class of continuous
functions by C; a product, such as V·T·C, will stand for the common part of the
two or more classes named.†

The only relations that seem to be already known among the several
definitions may be indicated as follows‡:


(1) P_H > A > H,    (2) A·C > H·C,    (3) F > V > H,
(4) V·C > H·C,      (5) T·C > A·C,    (6) V·T·C = H·C.


3. Some properties of functions belonging to these classes.§ We first prove
the following theorem.


THEOREM 1. If f(x, y) is in class H, the total variation function φ(x̄)
[ψ(ȳ)] is of bounded variation in the interval (a, b) [(c, d)].||


Assume the contrary; then, given any M > 0, there exists a set of numbers
x_i (i = 0, 1, 2, ..., m) with a = x_0 < x_1 < ... < x_m = b


* Monotone in the sense of Hobson, loc. cit., p. 343.

† From the definitions the following relations are easily seen: V > V·B, F > F·B, T > T·B,
H = H·B, A = A·B, P = P·B, and P_H = P_H·B.

‡ For a proof of the relation A ≥ H see for example Hobson, loc. cit., pp. 345-346; the relation
A > H then follows from an example given by Küstermann, Funktionen von beschränkter Schwankung
in zwei reellen Veränderlichen, Mathematische Annalen, vol. 77 (1916), pp. 474-481. Since
Küstermann's example is continuous, it also gives us A·C > H·C. A proof of the relation P_H > A is given by
Hahn, loc. cit., pp. 546-547. From the definitions we clearly have V ≥ H, and the relations V > H and
V·C > H·C may then be inferred from the example f(x, y) = x sin (1/x) (x ≠ 0), f(0, y) = 0. That F is
≥ V is obvious from the definition; the definite inequality F > V is established by Littlewood, On
bounded bilinear forms in an infinite number of variables, Quarterly Journal of Mathematics, Oxford
Series, vol. 1 (1930), pp. 164-174. The relations T·C > A·C and V·T·C = H·C are stated by Tonelli,
loc. cit.

§ Only properties of the total variation functions φ(x̄) and ψ(ȳ) are considered here; other
properties will be examined in a forthcoming paper.

|| This property is not enjoyed by all functions of class A; indeed it is easily seen (compare
example (C) below) that f(x, y) may be in A and yet φ and ψ be everywhere discontinuous. It is clear
that if f(x, y) is in V, φ [ψ] is either everywhere infinite or of bounded variation.


J. A. CLARKSON AND C. R. ADAMS 


and such that

$$\sum_{i=1}^{m} |\phi(x_i) - \phi(x_{i-1})| > M.$$

Consider any two successive points $(x_{i-1}, c)$ and $(x_i, c)$. From the definition
of $\phi(\bar{x})$, there exists a set of points $p_j$ on the line $x = x_i$, and their projections
$p_j'$ on the line $x = x_{i-1}$, such that we have

$$\Bigl|\, \sum_j |f(p_j) - f(p_{j-1})| - \sum_j |f(p_j') - f(p_{j-1}')| \,\Bigr| \ge |\phi(x_i) - \phi(x_{i-1})|/2.$$

Hence for the net N composed of the boundary lines of the rectangle, the
lines $x = x_{i-1}$ and $x = x_i$, and the horizontal lines through the points $p_j$, the
V-sum is

$$\sum_{i=0,\,j=0}^{m-1,\,n-1} |\Delta_{11} f(x_i, y_j)| \ge |\phi(x_i) - \phi(x_{i-1})|/2.$$

By a repetition of this process for each interval $(x_{i-1}, x_i)$ we may prove the
existence of a net N' for which the V-sum $V_{N'}(f)$ is ≥ M/2, thus contradicting
the hypothesis that f(x, y) is of class H.

Before proving Theorem 2 we demonstrate the following lemma, due
essentially to Borel.


Lemma 1. Let E be a bounded set of positive interior measure and let the
sequence of functions $\{f_n(x)\}$ defined on E converge to the limit function f(x)
at each point of E. Then, if $\epsilon$ is any positive number and if $E_n(\epsilon)$ denotes the
subset of E where

$$|f(x) - f_n(x)| > \epsilon,$$

we have

$$\lim_{n \to \infty} m_i E_n(\epsilon) = 0.$$

Since $m_i E_n(\epsilon)$ is the least upper bound of the measures of the measurable
subsets of $E_n(\epsilon)$, it suffices to show that if $\{E_n'(\epsilon)\}$ is any sequence of measurable
sets contained respectively in $\{E_n(\epsilon)\}$, then $\lim_{n \to \infty} m E_n'(\epsilon)$ is 0. This
will be true if E*, the complete limit of $\{E_n'(\epsilon)\}$, is of measure zero; but E* is
a null set, since the sequence $\{f_n(x)\}$ cannot converge at any point of E*.


† See Borel, Leçons sur les Fonctions de Variables Réelles, Paris, 1905, p. 37, where essentially
this lemma is indicated but not proved.




THEOREM 2. If a function f(x, y) is in class $P_H$, and E is the set of points
$\bar{x}$ [$\bar{y}$] in the interval (a, b) [(c, d)] for which $\phi(\bar{x})$ [$\psi(\bar{y})$] is infinite, then $m_i E$
is zero.†

In particular, if E is measurable (as it would be if for example f(x, y)
were continuous; cf. Theorem 4), it is of measure zero.

To prove Theorem 2, assume f(x, y) is in $P_H$ and that E, the set of points
$\bar{x}$ for which $\phi(\bar{x})$ is infinite, is of positive interior measure. On E define the
sequence of functions $f_n(x)$ as follows. For a fixed n, let R be divided by a net
N into $n^2$ congruent rectangles, and at the point $\bar{x}$ of E let

$$g_n(\bar{x}) = \sum_{i=1}^{n} [\text{oscillation of } f(\bar{x}, y) \text{ in the interval } y_{i-1} \le y \le y_i].$$

Let $f_n(\bar{x}) = 1/g_n(\bar{x})$. Then $\lim_{n \to \infty} f_n(x)$ is 0 at each point of E, and hence at
each point of E', some arbitrarily selected measurable subset of E of positive
measure.

Let $\epsilon > 0$ be given. By Lemma 1 there exists an n such that $m_i E_n(\epsilon)$ is
$< \epsilon$, where $E_n(\epsilon)$ is the subset of E' on which $|f_n(x)|$ is $> \epsilon$. Consider the net
N which defines $f_n(x)$. Let $\lambda$ be the number of columns of N in which points
of the set $E' - E_n(\epsilon)$ occur. We have

(a) $m_e[E' - E_n(\epsilon)] \le \lambda (b - a)/n.$

From the relations

$$m_e[E' - E_n(\epsilon)] = mE' - m_i E_n(\epsilon) \quad \text{and} \quad m_i E_n(\epsilon) < \epsilon$$

we have

$$m_e[E' - E_n(\epsilon)] > mE' - \epsilon,$$

and hence by (a)

$$\lambda > n(mE' - \epsilon)/(b - a).$$

But in each of the $\lambda$ columns of N which contain points of the set $E' - E_n(\epsilon)$
the sum of the oscillations of f(x, y) in the several cells is at least $1/\epsilon$. Hence
for the net N we have

$$\frac{1}{n} \sum_{\nu=1}^{n^2} \omega_\nu \ge \frac{\lambda}{\epsilon n} > \frac{mE' - \epsilon}{\epsilon (b - a)}.$$

Since mE' is > 0, this last quantity increases indefinitely with $1/\epsilon$, while if
f(x, y) is in $P_H$ the sum on the left must be bounded.


† If f(x, y) is in A, $\phi$ [$\psi$] is clearly bounded.




The part of the theorem concerning $\psi(\bar{y})$ may of course be demonstrated
in the same manner.

We may note here that the set E of points $\bar{x}$ for which $\phi(\bar{x})$ is infinite may
nevertheless be everywhere dense in the interval (a, b), as the following example
shows. Define the function f(x, y) on the unit square I (0 ≤ x ≤ 1, 0 ≤ y ≤ 1)
as follows. Let the rational points of the segment 0 ≤ x ≤ 1 be enumerated and
designated by $x_1, x_2, x_3, \dots$. On each line $x = x_i$ define f(x, y) as 1
for y irrational and $> 1 - 1/2^i$, and zero otherwise. When x is irrational let
f(x, y) be zero for all y. For convenience we denote by $S_i$ the segment
$1 - 1/2^i < y < 1$ of the line $x = x_i$.

Clearly f(x, y) is of unbounded variation in y for each fixed rational x,
and these points are everywhere dense in the interval (0, 1). But f(x, y) is in
$P_H$. For consider any square net of $n^2$ cells on I. In all cells of such a net
except for those which contain more than one point of some segment $S_i$,
the oscillation is zero; in the remainder the oscillation is 1. Let M be the
number of the latter. Then M is at most equal to $M_1 + M_2 + M_3 + \cdots + M_p + n$,
where $M_i$ is the number of cells containing more than one point
of $S_i$, and p is the largest integer for which $1/2^p$ exceeds 1/n. But $M_i$ is less
than $2 + n/2^{i-1}$; hence we have

$$\sum_{\nu=1}^{n^2} \omega_\nu / n = M/n < 5$$

and f(x, y) is in $P_H$.
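
The arithmetic behind the bound M/n < 5 can be checked numerically. The following is only an illustrative sketch: the name `cell_count_bound` is ours, and the bound for each $M_i$ is read as $2 + n/2^{i-1}$, which is an assumption about the garbled original. It evaluates the estimate $M_1 + \cdots + M_p + n$ with each $M_i$ replaced by that bound.

```python
def cell_count_bound(n):
    """Upper estimate for M, the number of cells of an n-by-n net with oscillation 1."""
    # p is the largest integer for which 1/2**p exceeds 1/n, i.e. 2**p < n.
    p = 0
    while 2 ** (p + 1) < n:
        p += 1
    # Each M_i (i <= p) is replaced by the assumed bound 2 + n / 2**(i - 1).
    return sum(2 + n / 2 ** (i - 1) for i in range(1, p + 1)) + n


if __name__ == "__main__":
    for n in (2, 10, 100, 1000):
        print(n, cell_count_bound(n) / n)
```

Since p is smaller than the base-2 logarithm of n, the quotient stays below 3 + 2p/n, comfortably under 5.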
As a preliminary to the proof of our third theorem we shall first establish 
another lemma. 
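Let A denote any set of k real numbers,

A: $a_1, a_2, a_3, \dots, a_k$,

and let $\theta = \sum_{i=1}^{k} |a_i|$. With this set we may associate $2^k$ sums of the form

$$\pm a_1 \pm a_2 \pm a_3 \pm \cdots \pm a_k.$$

These sums occur in $2^{k-1}$ pairs, of opposite sign, $\pm S_j$ (j = 1, 2, 3, ..., $2^{k-1}$),
the subscripts being assigned arbitrarily. Let $S_j$ (j = 1, 2, 3, ..., $2^{k-1}$) be
that one of the jth pair which is positive, or zero if each sum in the pair
vanishes. Denote by $\sum A$ the sum $\sum_{j=1}^{2^{k-1}} S_j$.


Lemma 2. We have $\sum A \ge M_k \theta$, where

$$M_k = \begin{cases} k!/(2[(k/2)!]^2) & \text{for } k \text{ even}, \\ (k-1)!/[((k-1)/2)!]^2 & \text{for } k \text{ odd}. \end{cases}$$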




Since we shall make use of this result only for k odd, and a similar proof
can be given for k even, we confine ourselves to the

Proof for k odd. Without loss of generality we may assume the $a_i$ to
be non-negative, since both $\theta$ and $\sum A$ are invariant under the change of
sign of any $a_i$.

In the particular case in which all the $a_i$ are equal we have $\sum A = M_k \theta$.
For let $[S_j]_h$ (h = 0, 1, 2, ..., (k-1)/2) denote in this case the set of expressions
of the form $\pm\theta/k \pm \theta/k \pm \theta/k \pm \cdots \pm \theta/k$ in which exactly h minus
signs occur. Then each $S_j$ in $[S_j]_h$ has the value $(k - 2h)\theta/k$. In $[S_j]_h$ there
will be exactly $\binom{k}{h}$ sums $S_j$. Hence, adding, we obtain $\sum A = M_k \theta$.

We wish to show that in every case $\sum A \ge M_k \theta$. Let A be any set and let
a' and a'' be any two elements of A. Let $S_j'$ (j = 1, 2, 3, ..., m) be the
$2^{k-3}$ sums obtained from the set composed of the remaining elements of A.
Then the $2^{k-1}$ sums $S_j$ may be written in an array of four columns thus:

$$\begin{array}{llll}
S_1' + a' + a'', & |S_1' - a' - a''|, & |S_1' + a' - a''|, & |S_1' - a' + a''|, \\
S_2' + a' + a'', & |S_2' - a' - a''|, & |S_2' + a' - a''|, & |S_2' - a' + a''|, \\
\cdots & \cdots & \cdots & \cdots \\
S_m' + a' + a'', & |S_m' - a' - a''|, & |S_m' + a' - a''|, & |S_m' - a' + a''|.
\end{array}$$

If we denote by $C_1$, $C_2$, $C_3$, and $C_4$ the sums of the respective columns, we have
$\sum A = C_1 + C_2 + C_3 + C_4$. By comparison with the sum obtained when absolute
value signs are omitted from the third and fourth columns, we have at once

$$\sum A \ge C_1 + C_2 + 2 \sum_{j=1}^{m} S_j'.$$

But if in the set A we replace a' and a'' each by (a' + a'')/2 to form the set
A', we see that $\sum A'$ is precisely the right-hand member of this inequality.
Therefore, if in a set A any two elements are each replaced by their arithmetic
mean, $\sum A$ is not increased.

Now assume the existence of a set A of k elements with $\sum_{i=1}^{k} a_i = \theta$
(and hence with arithmetic mean $\theta/k$), and with $\sum A = M_k \theta - \delta$, where $\delta$ is
some positive number. Let $\xi$ be the absolute value of the greatest deviation
from $\theta/k$ of any one $a_i$. There are a finite number of the $a_i$, then, whose deviation
in absolute value exceeds $\xi/2$. Select one such and pair with it some element
whose algebraic deviation is of the opposite sign. Replace each of these
by half their sum. By repeating this operation we form the set $A_1$, with the
same arithmetic mean $\theta/k$, for which the greatest deviation from the mean
does not exceed $\xi/2$, while $\sum A_1$ is $\le \sum A$. This process may be repeated as
many times as we may desire, to yield a set $A_\nu$ whose elements deviate from
many times as we may desire, to yield a set A, whose elements deviate from 




the mean by as little as we wish. But as $\sum A$ is evidently a continuous function
of the elements of A, for sufficiently large $\nu$ we must have both

$$\Bigl| \sum A_\nu - M_k \theta \Bigr| < \delta \quad \text{and} \quad \sum A_\nu \le M_k \theta - \delta.$$

From this contradiction follows Lemma 2 for k odd.
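
Lemma 2 lends itself to a brute-force check for small k. The sketch below is an illustration only: the helper names `sum_A` and `M` are ours, and the value used for odd k is the one obtained by summing the equal-element case worked out in the proof, namely $(k-1)!/[((k-1)/2)!]^2$. Fixing the sign of the first element picks exactly one member of each pair $\pm S_j$, whose absolute value is the non-negative member.

```python
from itertools import product
from math import factorial


def sum_A(a):
    """Sum of the non-negative member of each pair +/-S_j for the set a."""
    first, rest = a[0], a[1:]
    total = 0.0
    # One sign pattern per pair: the sign of the first element is held at +.
    for signs in product((1, -1), repeat=len(rest)):
        total += abs(first + sum(s * x for s, x in zip(signs, rest)))
    return total


def M(k):
    """Constant M_k (odd case taken from the equal-element computation)."""
    if k % 2 == 0:
        return factorial(k) / (2 * factorial(k // 2) ** 2)
    return factorial(k - 1) / factorial((k - 1) // 2) ** 2


if __name__ == "__main__":
    a = [3, -1, 4, 1, -5]          # arbitrary sample set, k = 5
    theta = sum(abs(x) for x in a)
    print(sum_A(a), M(len(a)) * theta)
```

For equal elements the two printed quantities coincide, which is the extremal case identified in the proof.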
We may now prove 


THEOREM 3. If f(x, y) is in class F, the total variation function $\phi(\bar{x})$ [$\psi(\bar{y})$]
is either everywhere infinite, or is bounded and integrable in the sense of Riemann*
over the interval (a, b) [(c, d)].

Let f(x, y) be in class F, and for some $x_0$ ($a \le x_0 \le b$) let $\phi(x_0) = M_1$, a
finite number. Consider the function f'(x, y) = f(x, y) - f(b, y). Clearly
f'(x, y) is also in class F.

Let $x_1$ ($a \le x_1 \le b$) be any number distinct from $x_0$, and let $(x_1, y_i)$ (i = 0, 1,
2, ..., m) be any set of m + 1 points on the line $x = x_1$ with

$$c = y_0 < y_1 < y_2 < \cdots < y_m = d.$$

We have

$$f(x_1, y) = f(x_0, y) + f'(x_1, y) - f'(x_0, y),$$

whence

$$\sum_{i=1}^{m} |f(x_1, y_i) - f(x_1, y_{i-1})|$$
$$= \sum_{i=1}^{m} |f(x_0, y_i) + f'(x_1, y_i) - f'(x_0, y_i) - f(x_0, y_{i-1}) - f'(x_1, y_{i-1}) + f'(x_0, y_{i-1})|$$
$$\le \sum_{i=1}^{m} |f(x_0, y_i) - f(x_0, y_{i-1})| + \sum_{i=1}^{m} |\Delta_{11} f'(x_0, y_{i-1})|.$$

Since f'(x, y) is in F, there exists a number $M_2$ such that we have

$$\sum_{i=0,\,j=0}^{m-1,\,n-1} \epsilon_i \bar{\epsilon}_j \Delta_{11} f'(x_i, y_j) < M_2$$

for any net. But $\sum_{i=1}^{m} |\Delta_{11} f'(x_0, y_{i-1})|$ is the sum of the absolute values of
the differences $\Delta_{11} f'(x_i, y_j)$ in one column of cells of the net composed of the


* It is easily seen from the proof that discontinuities of $\phi$ [$\psi$] can occur only at a denumerable
set of points; indeed, for any $\epsilon > 0$, the number of points at which $\phi$ [$\psi$] has a saltus $> \epsilon$ is finite. It is
appropriate to remark also that example (E) below shows that $\phi$ [$\psi$] may be bounded but not of
bounded variation.




four vertical lines $x = a$, $x = x_0$, $x = x_1$, $x = b$ and the m + 1 horizontal lines
$y = y_i$ (i = 0, 1, 2, ..., m). Hence, a fortiori, it is less than $M_2$ and we have

$$\sum_{i=1}^{m} |f(x_1, y_i) - f(x_1, y_{i-1})| < M_1 + M_2;$$

thus $\phi(\bar{x})$ is bounded by the latter number.

Now assume that $\phi(\bar{x})$ is bounded but not integrable in the Riemann
sense. Then E, the set of points in the interval (a, b) at which $\phi(\bar{x})$ is discontinuous,
must be of positive exterior measure. Let $E_n$ be the subset of E
such that at each point of $E_n$ the saltus of $\phi(\bar{x})$ exceeds 2/n. Then E is
$\sum_{n=1}^{\infty} E_n$, and so for some fixed k we must have $m_e E_k > 0$. Let $x_1$ be any point
of $E_k$ and let $\Delta_1$ be an interval of length not exceeding $\tfrac{1}{2} m_e E_k$, with center $x_1$.
Within $\Delta_1$ there must be a point $x_1'$ such that $|\phi(x_1) - \phi(x_1')| > 1/k$. We may
assume without loss of generality that $\phi(x_1)$ is $> \phi(x_1')$. Let m and M be
any constants satisfying the inequalities

$$\phi(x_1') < m < M < \phi(x_1), \qquad M - m > 1/k.$$

Then there exists a set of points on the line $x = x_1$,

$$p_0 = (x_1, c),\ p_1,\ p_2,\ \dots,\ p_r = (x_1, d),$$

and their horizontal projections $q_i$ (i = 0, 1, 2, ..., r) on the line $x = x_1'$,
such that we have

$$\sum_{i=1}^{r} |f(p_i) - f(p_{i-1})| > M, \qquad \sum_{i=1}^{r} |f(q_i) - f(q_{i-1})| < m,$$

and hence

(b) $\sum_{i=1}^{r} |f(p_i) - f(p_{i-1}) - f(q_i) + f(q_{i-1})| > M - m > 1/k.$

Now consider the net $N_1$ on R consisting of the four vertical lines $x = a$,
$x = x_1$, $x = x_1'$, $x = b$, and the r + 1 horizontal lines through the points $p_i$ (i = 0,
1, 2, ..., r). From (b) it is seen that the sum of the absolute values of the
terms $\Delta_{11} f(x_i, y_j)$ associated with the single column of cells which stands on
the interval $(x_1, x_1')$ exceeds 1/k.

Since the length of $\Delta_1$ is $\le m_e E_k/2$, there is a point $x_2$ of $E_k$ exterior to $\Delta_1$.
Surround $x_2$ with an interval $\Delta_2$ of which it is the center, of length not exceeding
$m_e E_k/4$ and small enough so that it does not overlap $\Delta_1$. Proceeding as
before, we prove the existence of a second net $N_2$ of which one column of
cells possesses the property that the sum of the absolute values of the terms
$\Delta_{11} f(x_i, y_j)$ associated with it exceeds 1/k. It is clear that the net composed of




all the lines in both $N_1$ and $N_2$ has two distinct columns of cells each possessing
this property.

This process may be repeated indefinitely, and so we see that there exists
a $\delta > 0$ such that, given any integer k, there exists a net on R in at least k
columns of which the sum of the absolute values of the differences $\Delta_{11} f(x_i, y_j)$
in the several cells of the column exceeds $\delta$. We proceed to show that under
these conditions the sum

$$\sum_{i=0,\,j=0}^{m-1,\,n-1} \epsilon_i \bar{\epsilon}_j \Delta_{11} f(x_i, y_j)$$

may be made arbitrarily large by proper choice of the net N and the $\epsilon_i$'s
and $\bar{\epsilon}_j$'s.

Let k (taken odd for convenience) be given, and let N be a net, of n rows
and m columns of cells, such that in at least k columns the above condition
is satisfied. Consider the matrix $\|a_{ij}\|$ for which $a_{ij} = \Delta_{11} f(x_{i-1}, y_{j-1})$ and
in which all the $a_{ij}$ but those arising from the k columns noted above are
suppressed. This matrix has, then, n rows and k columns; renumbering
the columns consecutively, we have

$$\begin{Vmatrix} a_{1n} & a_{2n} & \cdots & a_{kn} \\ \vdots & \vdots & & \vdots \\ a_{11} & a_{21} & \cdots & a_{k1} \end{Vmatrix}$$

with $\sum_{j=1}^{n} |a_{ij}| > \delta$ (i = 1, 2, 3, ..., k).
If it can be shown that the sum

$$F' = \sum_{i=1,\,j=1}^{k,\,n} \delta_i \bar{\delta}_j a_{ij} \qquad (|\delta_i| = |\bar{\delta}_j| = 1)$$

for some choice of the $\delta_i$'s and $\bar{\delta}_j$'s is arbitrarily large with k, the proof will
be complete, since max $F_N(f)$ is ≥ max F'.

The $\delta_i$'s may be chosen in $2^{k-1}$ essentially distinct ways. Let $S_{jp}$ represent
the absolute value of the sum yielded by the jth row of $\|a_{ij}\|$ with the pth
such choice, and let

$$F_p = \sum_{j=1}^{n} S_{jp} \qquad (p = 1, 2, 3, \dots, 2^{k-1}).$$

Then each $F_p$ is a particular value of F' corresponding to some choice of the
$\delta_i$'s and $\bar{\delta}_j$'s. We may write

$$\sum_{p=1}^{2^{k-1}} F_p = \sum_{p=1}^{2^{k-1}} \sum_{j=1}^{n} S_{jp} = \sum_{j=1}^{n} \sum_{p=1}^{2^{k-1}} S_{jp},$$

and by Lemma 2 we have

$$\sum_{p=1}^{2^{k-1}} S_{jp} \ge M_k \theta_j \quad (j = 1, 2, 3, \dots, n), \quad \text{where } \theta_j = \sum_{i=1}^{k} |a_{ij}|,$$

whence

$$\sum_{p=1}^{2^{k-1}} F_p \ge M_k \sum_{j=1}^{n} \theta_j > M_k k \delta = \frac{k!}{[((k-1)/2)!]^2}\, \delta,$$

since k is odd. It follows that at least one $F_p$ exceeds

$$\frac{k!}{2^{k-1} [((k-1)/2)!]^2}\, \delta.$$

By Stirling's formula we see that this quantity increases without limit with
k; therefore the sum $F_N(f)$ can have no bound, contrary to the hypothesis
that f(x, y) is in class F. Thus Theorem 3 is proved.
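
The growth asserted through Stirling's formula can be seen numerically. The sketch below is an illustration only: the name `lower_bound_ratio` is ours, and the coefficient is taken to be $k!/(2^{k-1}[((k-1)/2)!]^2)$ for odd k, which is an assumed reading of the garbled display. It computes that coefficient of $\delta$ exactly, in the equivalent form $k\binom{k-1}{(k-1)/2}/2^{k-1}$, and compares it with $\sqrt{2k/\pi}$.

```python
from math import comb, pi, sqrt


def lower_bound_ratio(k):
    """Coefficient of delta for odd k: k! / (2**(k-1) * (((k-1)/2)!)**2)."""
    if k % 2 != 1:
        raise ValueError("k is taken odd, as in the text")
    # k! / (((k-1)/2)!)**2 equals k * C(k-1, (k-1)/2).
    return k * comb(k - 1, (k - 1) // 2) / 2 ** (k - 1)


if __name__ == "__main__":
    for k in (1, 11, 101, 1001):
        print(k, lower_bound_ratio(k), sqrt(2 * k / pi))
```

The two printed columns approach each other in ratio as k grows, so the bound increases roughly like the square root of k.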


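THEOREM 4. If f(x, y) is in class C, then $\phi(\bar{x})$ and $\psi(\bar{y})$ are lower
semi-continuous functions.*


Let the interval (c, d) be divided into $2^n$ equal parts by the numbers

$$y_0 = c,\ y_1,\ y_2,\ \dots,\ y_{2^n} = d,$$

and set

$$\phi_n(\bar{x}) = \sum_{i=1}^{2^n} |f(\bar{x}, y_i) - f(\bar{x}, y_{i-1})|.$$

Since f($\bar{x}$, y) is continuous in y, we have $\lim_{n \to \infty} \phi_n(\bar{x}) = \phi(\bar{x})$, and since
f($\bar{x}$, $y_i$) (i = 0, 1, 2, ..., $2^n$) is continuous in $\bar{x}$, $\phi_n(\bar{x})$ is a continuous function
of $\bar{x}$. Moreover the sequence $\{\phi_n(\bar{x})\}$ is non-decreasing; hence $\phi(\bar{x})$ is lower
semi-continuous.† Similarly $\psi(\bar{y})$ is of like character.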
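
The monotone behaviour of the dyadic sums used in this proof is easy to observe numerically. The following sketch is illustrative only: `phi_n` is our name for the sum over the $2^n$ equal parts, and `sample` is a hypothetical continuous function chosen for the demonstration, not one taken from the paper.

```python
import math


def phi_n(f, x, c, d, n):
    """Variation sum of y -> f(x, y) over the division of (c, d) into 2**n equal parts."""
    parts = 2 ** n
    ys = [c + (d - c) * i / parts for i in range(parts + 1)]
    return sum(abs(f(x, ys[i]) - f(x, ys[i - 1])) for i in range(1, parts + 1))


def sample(x, y):
    # Hypothetical continuous test function (an assumption for illustration).
    return math.sin(6.0 * x * y)


if __name__ == "__main__":
    for n in range(0, 10):
        print(n, phi_n(sample, 0.9, 0.0, 1.0, n))
```

Each division refines the previous one, so by the triangle inequality the printed sums cannot decrease; they approach the total variation of the section y -> sample(0.9, y).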

4. Relations between pairs of classes. We shall establish the following: 


(7) P = $P_H$,   (8) T > H,   (9) V ≱ P, P ≱ V,
(10) A ≱ V, V ≱ A,   (11) V ≱ T, T ≱ V,   (12) A ≱ T, T ≱ A,
(13) P ≱ T, T ≱ P,   (14) F > V,   (15) F ≱ A, A ≱ F,
(16) F ≱ P, P ≱ F,   (17) F ≱ T, T ≱ F.

* It is apparent from the proof that the hypothesis that f(x, y) be continuous in each variable
separately is sufficient here.
† See, for example, Hobson, loc. cit., 2d edition, vol. 2, 1926, p. 149.




Proof of (7). Let us first assume f(x, y) to be in class P and consider any
net of $n^2$ cells as in definition $P_H$. Without loss of generality we may suppose
d - c ≥ b - a. Then there exists a net of square cells as used in definition P for
which we have D = (b - a)/n and whose vertical lines include x = a and x = b.
No square of the P net can overlap more than two cells of the $P_H$ net; hence
we have

$$\sum_{\nu=1}^{n^2} \omega_\nu / n \le \frac{2}{b - a} \sum D\, \omega_\nu',$$

which is bounded. This establishes the relation P ≤ $P_H$.

Now assume f(x, y) to be in $P_H$. Again let us suppose d - c ≥ b - a, and
consider any square net N, as used in definition P, for which we have
D < (b - a)/2. Let n be the largest integer satisfying the inequality (b - a)/n
≥ D; then we have

$$\frac{b-a}{n+1} < D \le \frac{b-a}{n} \quad \text{and} \quad \frac{b-a}{n} < 2\,\frac{b-a}{n+1} < 2D,$$

and therefore

(c) D ≤ (b - a)/n < 2D.


Consider now the net N' of $n^2$ cells as used in the $P_H$ definition. From the
second part of (c) it is seen that one cell of N' can overlap no more than three
columns of cells of N. The height of one cell of N' is (d - c)/n, and if p is
the smallest integer satisfying the inequality p ≥ (d - c)/(b - a), we have

(d - c)/n ≤ p(b - a)/n < 2pD;

hence one cell of N' can overlap no more than 2p + 1 rows of cells of N. Thus
we have

$$\sum D\, \omega_\nu \le 3(2p + 1)\, D \sum \omega_\nu' \le 3(2p + 1)(b - a) \sum \omega_\nu'/n,$$

which is bounded. This establishes the relation $P_H$ ≤ P, and we conclude the
identity of the two classes.

Proof of (8). It was shown in §3 that if f(x, y) is of class H, then the total
variation functions $\phi(\bar{x})$ and $\psi(\bar{y})$ are of bounded variation. From this follows
at once T ≥ H. Then from the example*

(A) $f(x, y) = \begin{cases} 0, & x < y \\ 1, & x \ge y \end{cases}$ in I, the unit square,

which is in T but not H, we infer (8).

* See Hahn, loc. cit., p. 547.




Proof of (9). The first part follows from example (A), which is in P but
not V. Example

(B) $f(x, y) = \begin{cases} x \sin (1/x), & x \ne 0 \\ 0, & x = 0 \end{cases}$ in I,

which is in V but not P, establishes the second part.
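
Two features of example (B) can be illustrated numerically. The sketch below is only an illustration with hypothetical helper names (`v_sum`, `variation_in_x`): since f does not depend on y, every second difference $\Delta_{11} f$ vanishes, so all V-sums are zero, while the variation in x along the points $x_k = 2/((2k+1)\pi)$, where sin(1/x) is alternately +1 and -1, grows without bound like a harmonic series.

```python
import math


def f(x, y):
    """Example (B): x*sin(1/x) for x != 0, and 0 for x = 0 (independent of y)."""
    return x * math.sin(1.0 / x) if x != 0 else 0.0


def v_sum(xs, ys):
    """Sum of |Delta_11 f| over the cells of the net with lines xs and ys."""
    total = 0.0
    for i in range(len(xs) - 1):
        for j in range(len(ys) - 1):
            total += abs(f(xs[i + 1], ys[j + 1]) - f(xs[i + 1], ys[j])
                         - f(xs[i], ys[j + 1]) + f(xs[i], ys[j]))
    return total


def variation_in_x(count):
    """Variation sum of x -> f(x, 0) over the first `count` points where sin(1/x) = +/-1."""
    pts = sorted(2.0 / ((2 * k + 1) * math.pi) for k in range(count))
    return sum(abs(f(b, 0.0) - f(a, 0.0)) for a, b in zip(pts, pts[1:]))


if __name__ == "__main__":
    print(v_sum([0, 0.25, 0.5, 1], [0, 0.5, 1]))
    for count in (10, 100, 1000, 10000):
        print(count, variation_in_x(count))
```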

Proof of (10). The first of these relations follows from the second of (9).
By taking sets of points $(x_i, y_i)$ along the perimeter of the rectangle R, one
sees immediately that a function of class A must satisfy the two latter conditions
of definition H. Since there exist functions which are in A but not H,
and by the last remark these must fail to be in V, the second of relations (10)
follows.

Proof of (11). The first of these relations is shown by example (A), which
is in T but not V, while example (B) shows the second.

Proof of (12). Example (A) establishes the first relation. The second is
a consequence of the following example.

(C) Let E be a non-measurable set in the interval 0 < x < 1, and let E' be
the set of points on the downward sloping diagonal of I whose projection on
the x-axis is E. Define f(x, y) as 1 at all points of E' and zero at all other points
of I. Then clearly f(x, y) is in A; but it is not in T, since $\phi(\bar{x})$ is not measurable.

Proof of (13). The first relation follows from example

(D) $f(x, y) = \begin{cases} 1 & \text{for } x \text{ and } y \text{ both rational} \\ 0 & \text{otherwise} \end{cases}$ in I,

which is in T but not P. The second follows from the second of (12).

Proof of (14). It has already been remarked that this relation, which is 
included in (3), has been established by Littlewood. His proof, however, de- 
pends upon the theory of bilinear forms in infinitely many variables; it may 
therefore be of interest to show how an example of a function which is in 
class F but not V can be constructed directly. Moreover, we can easily de- 
termine whether our example belongs to the classes P, A, and T; consequently 
it may be expected to be useful in proving other relations later. 

We first make a preliminary observation. Consider a function f(x, y)
defined in R. For any net N let max $F_N(f)$ denote the maximum value which
the sum

$$F_N(f) = \sum_{i=0,\,j=0}^{m-1,\,n-1} \epsilon_i \bar{\epsilon}_j \Delta_{11} f(x_i, y_j)$$

associated with N may be made to assume by a suitable choice of the $\epsilon_i$'s




and $\bar{\epsilon}_j$'s. If an additional line, horizontal or vertical, be added to N to form
the new net N', we have max $F_N(f)$ ≤ max $F_{N'}(f)$. For, suppose a horizontal
line be added. Then one row of cells of N is replaced by two new rows of cells
of N'; and if the two $\bar{\epsilon}$'s associated with these rows in the sum $F_{N'}(f)$ be both
assigned the same value as the $\bar{\epsilon}$ associated with the single replaced row in the
sum $F_N(f)$, and all the remaining $\epsilon$'s and $\bar{\epsilon}$'s given identical values in the two
sums, we have $F_N(f) = F_{N'}(f)$, from which the above observation follows.

By a "point-rectangle function" we shall mean a function f(x, y) defined
on R as follows: f(x, y) = ±1 (or some other constant) on each of a rectangular
array of points $p_{ij}$ in R, where the rows are equally spaced with each other
and with the lines y = c, y = d, and the columns likewise, and $p_{ij}$ is the point
standing in the jth row and ith column of the array; f(x, y) = 0 at all other
points of R. Let max F(f) denote the maximum value which the expression
$F_N(f)$ can attain for all possible nets and choices of the $\epsilon$'s and $\bar{\epsilon}$'s, and max
V(f) be the maximum value which the sum

$$\sum_{i=0,\,j=0}^{m-1,\,n-1} |\Delta_{11} f(x_i, y_j)|$$

can attain with all possible nets N. We consider the problem of determining
the value of (max F(f))/(max V(f)) for such a function.

Clearly max F(f) is attained by the use of a net consisting of one line 
through each row and column of points and one between every two rows and 
columns, together with the lines forming the boundary of R, since by our 
preliminary remark the net obtained by omitting any lines cannot yield a 
larger sum, and adding any line is extraneous as it merely introduces an ad- 
ditional row or column of cells each of which contributes zero to the sum. The 
position of the intermediate lines of the net is immaterial. 

Let N, then, be such a net on R, and consider next the problem of choosing
the $\epsilon$'s and $\bar{\epsilon}$'s so that $F_N(f)$ is a maximum.

Form the related matrix

$$\begin{Vmatrix} a_{1n} & a_{2n} & \cdots & a_{mn} \\ \vdots & \vdots & & \vdots \\ a_{12} & a_{22} & \cdots & a_{m2} \\ a_{11} & a_{21} & \cdots & a_{m1} \end{Vmatrix}$$

where $a_{ij} = f(p_{ij})$. Let

$$F' = \sum_{i=1,\,j=1}^{m,\,n} \delta_i \bar{\delta}_j a_{ij} \qquad (|\delta_i| = |\bar{\delta}_j| = 1).$$

Suppose the $\delta$'s and $\bar{\delta}$'s to be so chosen that max F' is attained. To a particular
$\bar{\delta}_s$ of the sum F' there correspond two consecutive $\bar{\epsilon}$'s of the sum $F_N(f)$;
namely, those attached to the two rows of cells of N whose top and bottom
edges, respectively, pass through the sth row of points p. If $\bar{\delta}_s$ is positive, let
these be assigned positive and negative values respectively, while if $\bar{\delta}_s$ is
negative let their signs be fixed in the reverse order. Let all the $\epsilon$'s and $\bar{\epsilon}$'s be
determined in this manner.

This choice will make $F_N(f)$ assume its maximum. For it will be seen that
if a particular term $\delta_i \bar{\delta}_j a_{ij}$ has the value +1, then the four cells of N which
have the point $p_{ij}$ in common will together contribute +4 to the sum $F_N(f)$,
while if this term has the value -1, these cells will contribute -4; so by this
choice we have $F_N(f)$ = 4 max F'. Suppose now that by some other choice of
$\epsilon$'s and $\bar{\epsilon}$'s we should have $F_N(f)$ > 4 max F'. If for some k we have $\bar{\epsilon}_{2k+1} = \bar{\epsilon}_{2k}$,
the two rows of cells of N to which these $\bar{\epsilon}$'s are attached contribute zero to
the sum $F_N(f)$; and such contribution as these rows make when $\bar{\epsilon}_{2k+1} = +1$
and $\bar{\epsilon}_{2k} = -1$ is minus that which they make when the values of the $\bar{\epsilon}$'s are
interchanged. Hence if by any choice it be possible to make $F_N(f)$ > 4 max F',
there must exist such a choice in which the $\epsilon$'s and $\bar{\epsilon}$'s occur by pairs with
different signs. But in this case we may, by reversing the process above, choose
the $\delta$'s and $\bar{\delta}$'s in the sum F', and it will be seen that with this choice we have
4F' = $F_N(f)$ > 4 max F', which is a contradiction. Hence max $F_N(f)$ = 4 max
F', and since, as previously remarked, we may attain the maximum of F(f)
by using the net N, we have

max F(f) = 4 max F'.
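
The identity max F(f) = 4 max F' can be confirmed by exhaustive search on a very small case. The sketch below is an illustration under stated assumptions: the 2-by-2 array of values `A`, the integer coordinates, and the helper names are all hypothetical choices of ours. The rectangle is taken as 0 ≤ x, y ≤ 6, the points sit at coordinates 2 and 4, and the net consists of the boundary, the lines through the points, and one line between them.

```python
from itertools import product

A = [[1, 1], [1, -1]]        # A[i][j]: value at the point in column i, row j (hypothetical)
PX, PY = [2, 4], [2, 4]      # coordinates of the point columns and point rows
XS = YS = [0, 2, 3, 4, 6]    # boundary, lines through the points, one line between


def f(x, y):
    """Point-rectangle function: A[i][j] at the array points, 0 elsewhere."""
    if x in PX and y in PY:
        return A[PX.index(x)][PY.index(y)]
    return 0


def best_over_signs(columns):
    """Max over row signs of the sum of |column totals| (column signs chosen optimally)."""
    rows = len(columns[0])
    return max(sum(abs(sum(s * v for s, v in zip(sb, col))) for col in columns)
               for sb in product((1, -1), repeat=rows))


def max_F_prime():
    return best_over_signs(A)


def max_F_net():
    m, n = len(XS) - 1, len(YS) - 1
    d = [[f(XS[i + 1], YS[j + 1]) - f(XS[i + 1], YS[j])
          - f(XS[i], YS[j + 1]) + f(XS[i], YS[j])
          for j in range(n)] for i in range(m)]
    return best_over_signs(d)


if __name__ == "__main__":
    print(max_F_prime(), max_F_net())
```

Integer coordinates are used so that membership tests on the point array are exact.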


If now we denote by max V' the sum

$$\sum_{i=1,\,j=1}^{m,\,n} |a_{ij}|,$$

we easily see that

max V(f) = 4 max V',

and hence

$$\frac{\max F(f)}{\max V(f)} = \frac{\max F'}{\max V'}.$$

We proceed to show that given any $\epsilon > 0$, there exists a matrix $\|a_{ij}\|$
with elements ±1 for which (max F')/(max V') is $< \epsilon$. This being so, we
may then assert the existence of a "point-rectangle function" f(x, y) for which
(max F(f))/(max V(f)) is $< \epsilon$ for any preassigned $\epsilon > 0$.

Consider the matrix

$$\begin{Vmatrix} a_{1n} & a_{2n} & \cdots & a_{2^{n-1},n} \\ \vdots & \vdots & & \vdots \\ a_{11} & a_{21} & \cdots & a_{2^{n-1},1} \end{Vmatrix}$$

in which all the elements of the bottom row are +1, and the rest of the various
columns consist of the $2^{n-1}$ possible ordered sets of n - 1 elements each
equal to +1 or -1. Consider max F' for this matrix. Let the $\bar{\delta}$'s be assigned
in any arbitrary way; then in order to make F' as large as possible, choose the
$\delta$'s so that the total sum contributed by each column shall be positive. If
this be done, we see that

1 column of elements contributes n,
n columns of elements contribute n - 2 each,
n(n - 1)/2! columns of elements contribute n - 4 each,
. . . . . . . . . .
n(n - 1)(n - 2) ··· ((n + 3)/2) / ((n - 1)/2)! columns of elements contribute 1 each,

and hence

$$F' = n + n(n-2) + \frac{n(n-1)}{2!}(n-4) + \cdots + \frac{n(n-1)(n-2)\cdots\left(\frac{n+3}{2}\right)}{((n-1)/2)!}.$$

Moreover this value is independent of the choice of the $\bar{\delta}$'s, so that max F'
equals this expression. Clearly we have max V' = $n 2^{n-1}$. Thus for matrices
of this type, we have

$$\lim_{n \to \infty} \frac{\max F'}{\max V'} = 0,$$

since the expression for (max F')/(max V') reduces to

$$\frac{(n-1)!}{2^{n-1} [((n-1)/2)!]^2},$$

which by Stirling's formula is $O(1/n^{1/2})$, and so tends to zero with 1/n.
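
For small odd n the value of max F' for this matrix can be found by exhaustive search and compared with the closed form $n\binom{n-1}{(n-1)/2}$, which is what the sum above amounts to. The sketch below is an illustration only, with helper names of our own choosing; each column is stored with its bottom (+1) entry first.

```python
from itertools import product
from math import comb


def special_matrix_columns(n):
    """The 2**(n-1) columns: bottom entry +1, then every ordered set of n-1 signs."""
    return [(1,) + rest for rest in product((1, -1), repeat=n - 1)]


def max_F_prime(columns):
    """Max over row signs; column signs are chosen so each column total is non-negative."""
    rows = len(columns[0])
    return max(sum(abs(sum(s * v for s, v in zip(sb, col))) for col in columns)
               for sb in product((1, -1), repeat=rows))


def ratio(n):
    """(max F') / (max V') with max V' = n * 2**(n-1)."""
    return max_F_prime(special_matrix_columns(n)) / (n * 2 ** (n - 1))


if __name__ == "__main__":
    for n in (3, 5, 7):
        print(n, max_F_prime(special_matrix_columns(n)), n * comb(n - 1, (n - 1) // 2), ratio(n))
```

The search cost grows like $4^n$, so the sketch is meant only for very small n.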

We now construct example

(E), a function in class F but not V. Let I, the unit square, be divided into
quarter squares, and let $S_1$ be the upper left-hand quarter square. Next divide
the lower right-hand square into quarter squares, and let $S_2$ be that
quarter which has a common vertex with $S_1$, etc. We obtain in this way an
infinite sequence of square subdivisions of I converging toward the point
(1, 0). Now in $S_1$ let a "point-rectangle function" be defined for which we have
max V(f) = 1 and max F(f) < 1/2; thus if the set of points $p_{ij}$ contains $n_1 2^{n_1 - 1}$
points, $f(p_{ij})$ is $\pm 1/(4 n_1 2^{n_1 - 1})$ for each i and j. Similarly, in each $S_j$, define a
"point-rectangle function" for which max V(f) = 1, max F(f) < $1/2^j$. At all
remaining points of I let f(x, y) be zero.

It is readily seen that this function is not in V. Consider any net N on I,
and let $S_k$ be the last $S_j$ through which lines of this net pass. By adding a
sufficient number of lines to insure the largest possible contribution from
each $S_j$ (j ≤ k), which cannot decrease max $F_N(f)$, we see that for the net N
we have

$$\max F_N(f) < \frac{1}{2} + \frac{1}{2^2} + \cdots + \frac{1}{2^k} < 1.$$

Hence f(x, y) is in class F, and relation (14) is established.
Proof of (15). The first part follows from example

(F) $f(x, y) = \begin{cases} 1 & \text{on main diagonal (through (0, 1), (1, 0)) of I,} \\ 0 & \text{elsewhere in I,} \end{cases}$

which is in A but not in F. The second part may be deduced from example
(B).

Proof of (16). Examples (F) and (B).

Proof of (17). Examples (F) and (B).

5. Relations concerning the extent of the common part of two or more 
classes. We first establish the following relations involving a single class on 
the one hand and the product of two classes on the other*: 


(18) H = A·V,   (21) H < A·T,
(19) H = V·T,   (22) H < P·T,
(20) H = P·V,   (23) H < A·F,


* From this list are intentionally omitted all relations such as P > P·F, in which the class on the
left appears also on the right; the inequality is definite in the light of relations (9)-(13), (15)-(17).
From this and all subsequent lists all relations involving "reducible" products (such as P·A which
reduces to A by (1), and P·V which reduces to H by (20)) are also omitted.




(24) H < P·F,   (30) V ≱ P·T, P·T ≱ V,
(25) H < F·T,   (31) F ≱ A·T, A·T ≱ F,
(26) T > A·F,   (32) F ≱ P·T, P·T ≱ F,
(27) T > P·F,   (33) V ≱ A·F, A·F ≱ V,
(28) A ≱ P·T, P·T ≱ A,   (34) V ≱ P·F, P·F ≱ V,
(29) V ≱ A·T, A·T ≱ V,   (35) V ≱ F·T, F·T ≱ V,

(36) A ≱ F·T, F·T ≱ A.


Proof of (18). By (1) and (3) we have H ≤ A·V. But a function of class V
satisfies the first condition of definition H, and a function which is in A satisfies
the two latter conditions of H. Hence we have A·V ≤ H, and (18)
follows.

Proof of (19). From (3) and (8) follows H ≤ V·T. But if f(x, y) is in V·T,
it satisfies the first condition of definition H, and by definition T the total
variation functions $\phi(\bar{x})$ and $\psi(\bar{y})$ must be finite almost everywhere; thus
we have V·T ≤ H, and hence (19).

Proof of (20). The relation H ≤ P·V follows from (1) and (3). But if
f(x, y) is in P·V, by Theorem 2 the functions $\phi(\bar{x})$ and $\psi(\bar{y})$ are surely finite
for at least one point in their respective intervals; and as the first condition
of definition H is also satisfied, we have P·V ≤ H, and hence (20).

Proof of (21). From (1) and (8) we obtain H ≤ A·T. From example (F),
which is in A·T but not H, (21) is inferred.

Proof of (22). The relation H ≤ P·T is a consequence of (1) and (8).
Example (F) then establishes (22).

Proof of (23). By (1) and (3) we have H ≤ A·F. Then consider example
(E). That function was shown to be in class F; it is, moreover, in class A.
For let $(x_i, y_i)$ be any set of points as used in definition A. Then $f(x_i, y_i)$
vanishes at all these points excepting at most those which lie within one
square $S_j$. For this set of points we have


where $n_j 2^{n_j - 1}$ is the number of points in the array $p_{ij}$ used to define the
"point-rectangle function" in $S_j$. But as this expression is bounded, and indeed
approaches zero with $1/n_j$, f(x, y) is in A. Hence f(x, y) is in A·F, but since
it is not in V it cannot be in H, from which fact (23) follows.

Proof of (24). This is implied by relation (23).

Proof of (25). By (3) and (8) we have H ≤ F·T. Example (E) is clearly
in class T, since $\phi(\bar{x})$ and $\psi(\bar{y})$ are zero except for a denumerable set of points;
and since it is also in F but not in H, we infer (25).

Proof of (26). By Theorem 3, if f(x, y) is in class A·F, $\phi(\bar{x})$ and $\psi(\bar{y})$
must be bounded and integrable in the sense of Riemann, from which we
have T ≥ A·F. Then relation (26) follows from relation (12).

Proof of (27). By Theorems 2 and 3, if f(x, y) is in class P·F, $\phi(\bar{x})$ and
$\psi(\bar{y})$ are bounded and integrable in the Riemann sense, whence follows the
relation T ≥ P·F. From relation (13), relation (27) then follows.

Proof of (28). The first part follows from example (A), which is in P·T
but not A. The second part is implied by the second of the relations (12).

Proof of (29). The first of these relations follows from example (F); the
second from the first of relations (10).

Proof of (30). Example (F); relation (11).

Proof of (31). Example (F); relation (15).

Proof of (32). Example (F); relation (16).

Proof of (33). The first relation is shown by example (E), which has been
proved to be in F and A, but not in V. The second part is a consequence of
example (B).

Proof of (34). Example (E); relation (9).

Proof of (35). Example (E); relation (11).

Proof of (36). The second part of this relation is a consequence of (12).
To establish the first part we shall now exhibit a function which is in F-T but 
not A. 

As a preliminary step we define a matrix $\|a_{ij}\|$ in the following manner.*

Let

$$a_{1j} = a_{i1} = 0 \qquad (i, j = 1, 2, 3, \dots, n + 1),$$

and determine the remaining elements by assigning the values of

$$A_{ij} = a_{i+1,j+1} - a_{i,j+1} - a_{i+1,j} + a_{ij} \qquad (i, j = 1, 2, 3, \dots, n)$$

to satisfy the conditions

$$|A_{ij}| = 1 \qquad (i, j = 1, 2, 3, \dots, n),$$

$$\sum_{j=1}^{n} A_{ij} A_{i'j} = 0 \text{ for } i' \ne i, \qquad \sum_{i=1}^{n} A_{ij} A_{ij'} = 0 \text{ for } j' \ne j.$$

It is known that there exist such orthogonal matrices $\|A_{ij}\|$ for an infinite sequence
of values of n. For such a matrix the sum


* We gratefully acknowledge our indebtedness to the late Dr. R. E. A. C. Paley for the construction
of this matrix.




F= Dek Ai; = (| 1), 


is O(n*/?), since the matrix ||Aj,|| is also orthogonal, and by Schwarz’s 
inequality we have F 
Let di;=4;,;4:—@:;; then we have 


I 
dr; = DiAi;, 
i=1 


and the sum 


j=1 j=] 
where 


€; = sgn 


is also readily seen to be O(m*/*) by a second application of Schwarz’s in- 

equality. This sum represents the “total variation” in the Jth row 

(I=1, 2, 3, --- , m+1) of ||a,,||. It is evident from symmetry that the total 

variation in the Jth column (J =1, 2, 3, - - - , m+1) of ||a;,|| is also O(n*/*). 
Now for each j consider
\[ \max_I |d_{Ij}| \qquad (I = 1, 2, 3, \dots, n+1); \]
let $I_j$ be the least value of I for which this maximum is assumed. If the numbers $I_j$ do not increase monotonically with j, the columns of $\|A_{ij}\|$ may be rearranged so that they do; this will clearly leave undisturbed the properties of the matrix $\|A_{ij}\|$ described above. Numbering the rows of both matrices $\|a_{ij}\|$ and $\|A_{ij}\|$ from bottom to top we observe that the sum
\[ \sum_{j=1}^{n} |d_{I_j j}| \]
is part of what may be thought of as an Arzelà-sum for the matrix $\|a_{ij}\|$. But this sum, and therefore the maximum Arzelà-sum for the matrix, will not be $O(n^{3/2})$ if the matrix $\|A_{ij}\|$ be defined thus*:


* The orthogonality of this matrix is shown by Paley, On orthogonal matrices, Journal of Mathematics and Physics of the Massachusetts Institute of Technology, vol. 12 (1933), pp. 311-320. That $\sum |d_{I_j j}|$ is not $O(n^{3/2})$ is indicated by Paley, Note on a paper of Kolmogoroff and Menchoff, forthcoming in the Mathematische Zeitschrift.




1933] DEFINITIONS OF BOUNDED VARIATION 


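\[ A_{ij} = \begin{cases} +1 & \text{for } i = j; \\ \chi(i - j) & \text{for } i \ne j,\ i \ne 0,\ j \ne 0; \\ -1 & \text{for } i = 0 \text{ or } j = 0 \text{ but } i \ne j; \end{cases} \]

where $\chi(m)$ is a real primitive Dirichlet's character to the prime modulus p, with $p = n - 1 \equiv 3 \pmod 4$.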
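Orthogonality of a matrix of this kind can be checked numerically for small primes. The sketch below is an illustration only. It builds the standard Paley matrix of order p + 1 from the quadratic character, with +1 on the diagonal and in the first row, and -1 in the first column below the corner. That sign convention is an assumption, chosen because it is known to give orthogonal rows; it may differ from the convention printed above. The function names are hypothetical, and p is assumed to be prime (this is not checked).

```python
def chi(a, p):
    """Quadratic (Legendre) character of a modulo the odd prime p."""
    a %= p
    if a == 0:
        return 0
    return 1 if pow(a, (p - 1) // 2, p) == 1 else -1


def paley_matrix(p):
    """Standard Paley matrix of order p + 1 for a prime p congruent to 3 mod 4.

    Assumed convention: +1 on the diagonal and in the first row,
    -1 in the first column below the corner, chi(i - j) elsewhere.
    """
    if p % 4 != 3:
        raise ValueError("p must be congruent to 3 modulo 4")
    n = p + 1
    rows = []
    for i in range(n):
        row = []
        for j in range(n):
            if i == j or i == 0:
                row.append(1)
            elif j == 0:
                row.append(-1)
            else:
                row.append(chi(i - j, p))
        rows.append(row)
    return rows


def is_orthogonal(matrix):
    """True when every entry is +1 or -1 and distinct rows have inner product 0."""
    n = len(matrix)
    for i in range(n):
        for k in range(n):
            dot = sum(matrix[i][j] * matrix[k][j] for j in range(n))
            if dot != (n if i == k else 0):
                return False
    return all(abs(v) == 1 for row in matrix for v in row)
```

Calling `is_orthogonal(paley_matrix(p))` for p = 3, 7, 11 gives matrices of orders 4, 8, 12.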

We may now construct example

(G), a function in F·T but not A. Let $\{S_k\}$ (k = 1, 2, 3, ...) be an infinite sequence of square subdivisions of the unit square I similar to that employed in example (E) but converging toward the point (1, 1). By the above discussion there exists a matrix $\|a_{ij}^{(k)}\|$, of $n_k$ rows and $n_k$ columns, for which F and the total variation in each row and column is $< 1/2^k$ while the maximum Arzelà-sum is > 1. In $S_k$ (k = 1, 2, 3, ...) let $p_{ij}^{(k)}$ be a square array of $n_k^2$ points, with rows and columns equally spaced. The points $p_{ij}^{(k)}$ then determine a set of square cells. In the cell whose vertices are $p_{ij}^{(k)}$, $p_{i+1,j}^{(k)}$, $p_{i,j+1}^{(k)}$, and $p_{i+1,j+1}^{(k)}$, including its boundary, let $f(x, y) = a_{ij}^{(k)}$ at each point except along the top and right-hand sides. At all other points of I let f(x, y) = 0. The function f(x, y) is then in both F and T but is not in A.

We list the following additional relations and indicate briefly the proof of 
each*: 


(37) A-T<P-T, (38) A-TEF-T,F-T ¥A-T, (39) A-F < A-T, 
(40) A-F<P-T, (41) A-F<F-T, (42) P-F < P-T,
(43) A-F=A-F-T, (44) P-F = P-F-T. 


Proof of (37). Relations (1) and (28).

Proof of (38). Relations (36) and (31).

Proof of (39). Relations (26) and (31).

Proof of (40). Relations (1), (26), and (32).

Proof of (41). Relations (26) and (36).

Proof of (42). Relations (27) and (32).

Proof of (43). Relation (26). 

Proof of (44). Relation (27). 

6. Relations between classes when only bounded functions are admitted
to consideration. Reasoning similar to that of the preceding sections readily
shows that each of the forty-four relations given above remains valid if
bounded functions alone are considered. We thus have the further results†:


* Relations (43) and (44), together with (19), show that there are no irreducible products of 
three classes; hence such products need no further consideration. 
† Numbers are used here and later to correspond with those of similar relations given above.




(3b) V-B> H, (8b) T-B>H, 
(9) P,P >V-B, (10b) 4 = V-B,V-B >A, 
(11b) V-BZT-B,T-BYV-B, (12b) 42 7-B,T-B >A, 


and so on. 

7. Relations between classes when only continuous functions are ad- 
mitted. We establish the following set* of relations: 

(ic) T-C>P-C>A-C>H-C, (9°) P-CZV-C,V-C > P-C, 
(10c) A-C£V-C,V-C >A-C, (11c) T-CZV-C,V-C > T-C, 
(14c) F-C > V-C, (15c) A-CZF-C,F-C¥A-C, 
(16c) P-CZF-C,F-C P-C, (17c) T-CZF-C,F-C +T-C, 
(23c) H:C < A-F-C, (24c) H:C < P-F-C, 

(25c) H-C <F-T-C, (33c) A-F-C,A-F-C¥V-C, 
(34c) P-F-C, P-F-C > V-C, (35c) vcz F.T-C,F-T-C >V-C. 

The proof of (1c) will be given in three parts. 

Proof of A·C > H·C. This relation was established by Küstermann, loc. cit., who gave an example of a continuous function which is in class A but not H; a simpler example is given by Hahn, loc. cit. The following function, example

(H), will be found to exhibit the same property; moreover one may easily determine whether it is in classes F and T. Let $S_1, S_2, S_3, \dots$ be an infinite sequence of square subdivisions of I converging toward the point (1, 0) as defined in example (E). In each $S_j$ let f(x, y) be defined by the surface of a regular square pyramid whose base is $S_j$ and height 1/j, and let f(x, y) vanish over the rest of I. Then f(x, y) is continuous; it is not in V and hence not in H. For if a net N be defined whose lines consist of the lines through the sides of the squares $S_j$ (j = 1, 2, 3, ..., m) and lines horizontally and vertically through the centers of these squares, for this net we have

\[ V_N(f) = 4 \sum_{n=1}^{m} 1/n, \]

which may be arbitrarily large. But f(x, y) is in class A. For if $(x_i, y_i)$ be any set of points as used in the Arzelà definition, $f(x_i, y_i)$ vanishes except at such points as lie within one square $S_j$, and we have $\sum_i |\Delta f(x_i, y_i)| < 2$.
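The growth of the V-sum for this net is that of the harmonic series. A minimal sketch, assuming (as the construction suggests) that the net lines belonging to one square do not cut any other square, so that each pyramid contributes exactly four cells; the function names are hypothetical:

```python
from fractions import Fraction


def cell_differences(height):
    """Absolute mixed differences over the four cells cut from one pyramid's base
    by the lines through its center. Each cell has the apex value at one vertex
    and zero at the other three, so each difference equals the height."""
    apex, edge = Fraction(height), Fraction(0)
    return [abs(apex - edge - edge + edge)] * 4


def v_sum(m):
    """V-sum for the net through the sides and centers of S_1, ..., S_m."""
    return sum(sum(cell_differences(Fraction(1, j))) for j in range(1, m + 1))
```

For instance `v_sum(4)` is 4(1 + 1/2 + 1/3 + 1/4) = 25/3, and the values exceed any fixed bound as m grows.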

* The correspondents of certain earlier relations do not appear here, since T·C includes both P·C and A·C, whereas T includes neither P nor A. We omit also all relations such as T·C > T·F·C, in which the class on the left (other than C) appears also on the right; in such relations the inequality is always definite by virtue of the relations (9c)-(11c) and (15c)-(17c).




This function is in class T, since $\phi(\bar x)$ and $\psi(\bar y)$ are continuous. It is not in class F; for if a net N as defined in the preceding paragraph be used, the $\epsilon_i$'s and $\bar\epsilon_j$'s can be chosen so that $F_N(f) = V_N(f)$, which is arbitrarily large.

Proof of P·C > A·C. We clearly have P·C ≥ A·C. To remove the possibility of equality consider example

(I), a function defined in I in precisely the same manner as example (H) except that the sequence of subsquares $\{S_j\}$ shall in this case converge toward the point (1, 1); i.e., example (I) is obtained from example (H) by changing the position of the x- and y-axes. As the function was in class P before the change, the new function is clearly in that class also; it is easily seen to be in class T but not in classes H, A, V, or F.

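Proof of T·C > P·C. We first establish the relation T·C ≥ P·C.

Assume f(x, y) to be of class P·C in R and suppose $\sum_\nu \omega_\nu / n < M$. Let the sequence of functions $\{\phi_n(\bar x)\}$ be defined in the interval $a \le \bar x \le b$ as follows: for a fixed n and $\bar x$,

\[ \phi_n(\bar x) = \sum_{i=1}^{2^n} |f(\bar x, y_i) - f(\bar x, y_{i-1})|, \]

where $y_0 = c$, $y_{2^n} = d$, and $y_i - y_{i-1} = (d - c)/2^n$ (i = 1, 2, 3, ..., $2^n$). Then each $\phi_n(\bar x)$ is continuous, and we have

(d) $\lim_{n \to \infty} \phi_n(\bar x) = \phi(\bar x)$;

moreover the sequence $\{\phi_n(\bar x)\}$ is positive and non-decreasing. Hence* we have

\[ \lim_{n \to \infty} \int_a^b \phi_n(x)\,dx = \int_a^b \phi(x)\,dx. \]

Now for any fixed n let the symmetrical net N of $2^{2n}$ congruent cells be considered, and let $I_j$ (j = 1, 2, 3, ..., $2^n$) be the jth of the $2^n$ equal parts into which N divides the interval $a \le x \le b$; then we have

\[ \int_a^b \phi_n(x)\,dx = \sum_{j=1}^{2^n} \int_{I_j} \phi_n(x)\,dx. \]

Let $B_j$ denote the least upper bound of $\phi_n(\bar x)$ in $I_j$; then

\[ \sum_{j=1}^{2^n} \int_{I_j} \phi_n(x)\,dx \le [(b - a)/2^n] \sum_{j=1}^{2^n} B_j. \]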


* See, for example, de la Vallée Poussin, Cours d’Analyse Infinitésimale, vol. 1, Paris, 1914, p. 
264, Theorem III. 




But $B_j$ is at most equal to the sum of the oscillations $\omega_\nu$ in the jth column of cells of N, whence

\[ [(b - a)/2^n] \sum_{j=1}^{2^n} B_j \le (b - a) \sum_\nu \omega_\nu / 2^n < M(b - a). \]

Thus for all n, $\int_a^b \phi_n(x)\,dx$ is $< M(b - a)$ and consequently, by (d), $\phi(x)$ is summable. Since the same reasoning holds for $\psi(\bar y)$, the relation T·C ≥ P·C is proved.
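The proof relies on variation sums over successively halved subdivisions forming a non-decreasing sequence. A minimal sketch of that one-variable fact, using exact rational arithmetic and a sample function chosen purely for illustration (the function g and the interval [0, 1] are assumptions, not taken from the paper):

```python
from fractions import Fraction


def g(y):
    """Sample continuous function of one variable (hypothetical choice)."""
    return y**3 - y


def phi(n, c=Fraction(0), d=Fraction(1)):
    """Variation sum of g over the 2**n equal parts of [c, d]."""
    parts = 2**n
    ys = [c + (d - c) * Fraction(i, parts) for i in range(parts + 1)]
    return sum(abs(g(ys[i]) - g(ys[i - 1])) for i in range(1, parts + 1))


sums = [phi(n) for n in range(1, 9)]
```

Each refinement can only increase the sum, by the triangle inequality, and the sums stay below the total variation of g on the interval.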

We now construct example

(J), a function f(x, y) in class T·C but not P, thus establishing the relation T·C > P·C. To this end we employ a result of Tonelli,* that if f(x, y) is continuous and if the surface z = f(x, y) is of finite area†, f(x, y) is in class T.

Let $N_j$ (j = 2, 3, 4, ...) be the net which divides I, the unit square, into $2^{2(j-1)}$ equal subsquares $Q_{ji}$ (i = 1, 2, 3, ..., $2^{2(j-1)}$). Thus $N_{j+1}$ divides each subsquare $Q_{ji}$ of $N_j$ into four equal subsquares. We shall define the function f(x, y) over I by a surface Z which will in turn be defined as the limit of a sequence $\{Z_j\}$ of polyhedral surfaces over I, $Z_j$ corresponding to the net $N_j$.

Let $Z_1$ be a regular pyramid $A_1$ whose base is I and altitude 1. Its surface area may be denoted by S/2.

Let $Z_2$ be identical with $Z_1$ except over the squares of a set $Q_2'$ concentric with the squares $Q_2$ of $N_2$. Let a second set of smaller concentric squares $Q_2''$ be chosen. The squares of $Q_2'$ may be taken as small as desired, and, these having been chosen, the squares of $Q_2''$ may be selected as small as desired. Limitations on their size are presently to be imposed.

As a first limitation on $Q_2'$ let the oscillation of $Z_1$ be less than 1/2 in each square of $Q_2'$. Within $Q_{2i}''$ (where this is the square of $Q_2''$ interior to $Q_{2i}'$, in turn interior to $Q_{2i}$) define $Z_2$ as a regular pyramid $A_{2i}$ of altitude 1/2 and with base in a horizontal plane. The plane of the base of $A_{2i}$ may be so chosen that $A_{2i}$ lies wholly between the two horizontal planes through the lowest and highest points of $Z_1$.

Figure 1 is intended to indicate a top elevation of the part of the surface $Z_2$ now being described. ABCD is the space quadrilateral on $Z_1$ whose projection on the xy-plane is $Q_{2i}$; A'B'C'D' is the space quadrilateral on $Z_1$ whose projection is $Q_{2i}'$; and A''B''C''D'' is the base of the pyramid $A_{2i}$ whose projection is $Q_{2i}''$. Let a, b, c, and d be the mid-points of the sides of A''B''C''D''. Then plane triangles may be interpolated between the space quadrilateral


* See Tonelli, loc. cit. 
† In the sense of Lebesgue.




A'B'C'D' and the base of $A_{2i}$; these triangles are A'A''a, A'aB', aB''B', etc. The plane triangles thus interpolated we use to define the part of $Z_2$ standing over the region between the two squares $Q_{2i}'$ and $Q_{2i}''$; $Z_2$ so defined is continuous within $Q_{2i}$, and hence throughout I. The position of the plane of the base of $A_{2i}$ is further restricted merely by the condition that the oscillation of $Z_2$ in $Q_{2i}'$ (which by the presence of the pyramid $A_{2i}$ is not less than 1/2) shall be 1/2. Evidently, by decreasing the size of the squares $Q_{2i}'$ and $Q_{2i}''$, we may make the surface area of $Z_2$ within $Q_{2i}'$ as small as we wish; hence we may impose the final limitation upon the size of the squares, that the resulting area of $Z_2$ shall not exceed $S(\tfrac12 + \tfrac14)$. To provide for further subdivision we require that the lengths of the sides of the squares in $Q_2$, $Q_2'$, and $Q_2''$ be relatively incommensurable.

Fig. 1

Each succeeding surface $Z_p$ is defined by means of the surface $Z_{p-1}$ in a similar manner. Let $Z_p$ be identical with $Z_{p-1}$ except over the squares of a set $Q_p'$ concentric with the squares $Q_p$ of $N_p$. Let $Q_{pi}'$ be chosen sufficiently small so that its perimeter does not intersect the perimeter of any previously chosen $Q_{ji}'$ or $Q_{ji}''$. Let a second set of smaller concentric squares $Q_p''$ be chosen.

As the next limitation on $Q_p'$ let the oscillation of $Z_{p-1}$ be less than 1/p in each square of $Q_p'$. Within $Q_{pi}''$ (where this is the square of $Q_p''$ interior to $Q_{pi}'$, in turn interior to $Q_{pi}$) define $Z_p$ as a regular pyramid $A_{pi}$ of altitude 1/p and with base in a horizontal plane. $Q_{pi}'$ lies entirely within some smallest previously chosen $Q_{ji}'$ (which may be I itself), $Q_{mn}'$. The plane of the base of $A_{pi}$ may be so chosen that $A_{pi}$ lies wholly between the two horizontal planes through the highest and lowest points of $Z_{p-1}$ in $Q_{mn}'$.




Figure 2 is intended to indicate a top elevation of the part of the surface $Z_p$ now being described. ABCD is the space polygon on $Z_{p-1}$ whose projection on the xy-plane is $Q_{pi}$; $A'p_1p_2 \cdots B' \cdots C' \cdots D' \cdots$ is the space polygon on $Z_{p-1}$ whose projection is $Q_{pi}'$; and A''B''C''D'' is the base of the pyramid $A_{pi}$ whose projection is $Q_{pi}''$. Let a, b, c, and d be the mid-points of the sides of A''B''C''D''. Then plane triangles may be interpolated between the space polygon $A'p_1p_2 \cdots B' \cdots C' \cdots D' \cdots$ and the base of $A_{pi}$; these triangles are $A'ap_1$, $p_1ap_2$, etc. The plane triangles thus interpolated we use to define the part of $Z_p$ standing over the region between the two squares $Q_{pi}'$ and $Q_{pi}''$; $Z_p$ is then continuous within $Q_{pi}$, and hence throughout I. The position of the plane of the base of $A_{pi}$ is further restricted merely by the condition that the oscillation of $Z_p$ in $Q_{pi}'$ (which by the presence of the pyramid $A_{pi}$ is at least 1/p) shall be 1/p. The final limitation upon the size of the squares $Q_{pi}'$ and $Q_{pi}''$ is that the resulting surface area of $Z_p$ shall not exceed $S(\tfrac12 + \tfrac14 + \tfrac18 + \cdots + 1/2^p)$. In order to show that this result may be effected, we need only prove that the total surface area of $Z_p$ thus defined within $Q_{pi}'$ may be made arbitrarily small by choosing the squares $Q_{pi}'$ and $Q_{pi}''$ sufficiently small.

Fig. 2

Let any $\epsilon > 0$ be given. Then clearly there exists a $\delta_1$ such that if the side of $Q_{pi}''$ be taken less than $\delta_1$, the surface area S' of the pyramid $A_{pi}$ will be less than $\epsilon/2$. Now consider S'', the total surface area of the plane triangles between $Q_{pi}'$ and $Q_{pi}''$. Of these triangles eight have the property that each has




one side which coincides with half of a side of the base of $A_{pi}$; moreover the length of each of its other sides is bounded by $((d/2)^2 + (1/p)^2)^{1/2}$, where d is the length of the diagonal of $Q_{pi}'$; whence we may assert that there exists a $\delta_2$ such that if the side of $Q_{pi}''$ be taken less than $\delta_2$ in length, the surface area of these eight triangles will be less than $\epsilon/4$. There remain to be considered the rest of the triangles which contribute to S''. Each of these has one side whose length is bounded by $(\omega^2 + l^2)^{1/2}$, where $\omega$ is the oscillation of $Z_{p-1}$ in $Q_{pi}'$ and l is the length of a side of $Q_{pi}'$; likewise each of its other sides is bounded by $((d/2)^2 + (1/p)^2)^{1/2}$. Moreover the number of these triangles is limited, since the surface $Z_{p-1}$ consists of a finite number of plane pieces. Now by taking $Q_{pi}'$ sufficiently small we may make $\omega$ and l, and consequently one side of each of these triangles, as small as we please; hence there exists a $\delta_3$ such that if the side of $Q_{pi}'$ be taken less than $\delta_3$, the combined areas of these remaining triangles will be less than $\epsilon/4$. If, then, we require that the side of $Q_{pi}'$ be less than $\delta_3$, and the side of $Q_{pi}''$ be less than $\delta_1$ and $\delta_2$, the total area of the part of $Z_p$ within $Q_{pi}'$ will be less than $\epsilon$.

Finally, to provide for further subdivision, we take the lengths of the sides of the squares in $Q_p$, $Q_p'$, and $Q_p''$ relatively incommensurable.

Then if P is any point of I, and $h_j$ denotes the height of $Z_j$ over P, the sequence $\{h_j\}$ approaches a limit as j increases indefinitely. For if P does not lie within an infinite number of squares $Q_{ji}'$, all the $h_j$'s are equal for sufficiently large j. If P does lie within an infinite sequence of such squares, we have $|h_j - h_p| < 1/p$ for all j > p, and so the sequence $\{h_j\}$ converges. Let the surface Z be defined as the limit of the sequence $\{Z_j\}$.

Inasmuch as each $Z_j$ is continuous, and the sequence $\{Z_j\}$ converges uniformly, the surface Z is continuous and defines a continuous function f(x, y) over I. Moreover, as Z may be approximated arbitrarily closely by one of the sequence $\{Z_j\}$ of polyhedral surfaces, each of which is of area less than S, the area of Z does not exceed S; hence f(x, y) is in class T. But for each net $N_j$ we have


\[ \sum_i \omega_i / n \ge 2^{2(j-1)} (1/j) / 2^{j-1} = 2^{j-1}/j, \]

and as the latter quantity increases indefinitely with j, the function f(x, y) is not in class P.
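The two competing estimates in example (J) are simple to tabulate. A minimal sketch, assuming an oscillation of at least 1/j in each of the $2^{2(j-1)}$ cells of $N_j$ and taking n to be the number of cells along one side, $2^{j-1}$; the function names are hypothetical:

```python
from fractions import Fraction


def area_factor(p):
    """1/2 + 1/4 + ... + 1/2**p, the multiple of S bounding the area of Z_p."""
    return sum(Fraction(1, 2**k) for k in range(1, p + 1))


def pierpont_lower_bound(j):
    """Lower bound for the sum of oscillations over N_j divided by n = 2**(j-1),
    assuming an oscillation of at least 1/j in each of the 2**(2*(j-1)) cells."""
    cells = 2 ** (2 * (j - 1))
    n = 2 ** (j - 1)
    return Fraction(cells, j * n)
```

The area factor stays below 1 for every p, while the lower bound $2^{j-1}/j$ grows without limit.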

Proof of (9c). Example (B); example (I). 

Proof of (10c). Example (B); example (H). 

Proof of (11c). Example (B); example (H). 

Proof of (14c). Example (E) has already been given to exhibit a func- 
tion which is in class F but not in class V. We now show how this example 




may be modified so as to be continuous without otherwise essentially alter- 
ing its character. 

Consider a “point-rectangle function” f(x, y) such as is used in example 
(E), with |f| =1 on the array of 12*-' points p,; in R and with (max F(f)) 
/(max V(f)) < $\epsilon$. Surround each point $p_{ij}$ by a square $Q_{ij}$ with sides parallel to the axes and with $p_{ij}$ as center; all these squares are taken equal in size and small enough so that they do not abut or overlap.

Let f'(x, y) be defined on R as follows: within each $Q_{ij}$ let f'(x, y) be defined by the surface of a regular pyramid whose base is $Q_{ij}$ and whose height is $f(p_{ij})$; let f'(x, y) = 0 at all other points of R. This function is continuous on R, and for it also we have

(e) (max F(f'))/(max V(f')) < $\epsilon$.

For the following inequalities are easily seen to hold:

max F(f') ≥ max F(f), max V(f') ≥ max V(f);

and we shall show that max F(f') does not exceed max F(f), whence (e) will follow.

Let N be any net on R. Construct a second net N' by adding lines to N as follows. Add the horizontal lines through the center and upper and lower sides of $Q_{11}$, and the corresponding vertical lines. If a horizontal line l of N passes through $Q_{11}$, add to N the horizontal line l' so that $p_{11}$ is equidistant from l and l', and also the vertical lines l'' and l''' at the same distance from $p_{11}$. For each $Q_{ij}$ add to N four lines bearing the same relation to it that l, l', l'' and l''' bear to $Q_{11}$. Let this construction be carried out for each horizontal and vertical line through $Q_{11}$, and then the process repeated for every other $Q_{ij}$.

The net N' thus defined is symmetric; each $Q_{ij}$ is divided by N' in precisely the same way into $t^2$ rectangular subregions which are in general not square except for those along the diagonals, and for each of which, excepting those along the diagonals, the difference $\Delta_{11} f'(x_i, y_j)$ vanishes. For each of the squares along the diagonals the difference is in absolute value equal to one of the t/2 values a, b, c, ..., k, where a + b + c + ... + k = 1. The values of t and of the numbers a, b, c, ..., k depend upon the net N'. The situation for a particular $Q_{ij}$ may be represented as in Figure 3, where the number within each cell is the value of the difference $\Delta_{11} f'$ for that cell. The figure represents a $Q_{ij}$ when $f'(p_{ij}) = +1$ and t/2 = 4. It will be seen that the $\epsilon_i$'s and $\bar\epsilon_j$'s in the sum $F_{N'}(f')$ which attach to the cells yielding ±a may be chosen independently of those which attach to the cells yielding ±b, etc., and that


the maximum contribution obtainable from the cells yielding ±a is a·max F(f); from those yielding ±b, b·max F(f); etc. Hence we have

\[ \max F_{N'}(f') = a \cdot \max F(f) + b \cdot \max F(f) + \cdots + k \cdot \max F(f) = \max F(f). \]

But as $\max F_N(f')$ is ≤ $\max F_{N'}(f')$ (since N' was obtained from N by adding lines) and N was an arbitrary net, we conclude that max F(f') = max F(f), which was to be proved.
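The behaviour of the differences on a symmetric net can be verified directly for a single pyramid. A minimal sketch, assuming a pyramid of height 1 on the square with half-side 1 centered at the origin and a hypothetical choice of net lines at distances 1/5, 1/2 and 1 from the center (so t = 6 and t/2 = 3); exact rational arithmetic is used throughout:

```python
from fractions import Fraction


def pyramid(x, y, height=Fraction(1), half_side=Fraction(1)):
    """Regular square pyramid centered at the origin, zero outside its base."""
    r = max(abs(x), abs(y))
    return height * (1 - r / half_side) if r <= half_side else Fraction(0)


def mixed_differences(cuts):
    """Mixed second differences of the pyramid over the cells of a symmetric net.

    `cuts` lists the positive line positions; the net uses 0 and each cut with
    both signs, for horizontal and vertical lines alike.
    """
    lines = sorted({Fraction(0)} | {s * Fraction(c) for c in cuts for s in (1, -1)})
    t = len(lines) - 1
    table = {}
    for k in range(t):
        for m in range(t):
            x0, x1, y0, y1 = lines[k], lines[k + 1], lines[m], lines[m + 1]
            table[(k, m)] = (pyramid(x1, y1) - pyramid(x1, y0)
                             - pyramid(x0, y1) + pyramid(x0, y0))
    return t, table


t, table = mixed_differences([Fraction(1, 5), Fraction(1, 2), Fraction(1)])
# Cells off both diagonals of the t x t array of cells.
off_diagonal = [v for (k, m), v in table.items() if k != m and k + m != t - 1]
# Cells along one half of the main diagonal, from the center outward.
half_diagonal = [abs(table[(k, k)]) for k in range(t // 2, t)]
```

Here the off-diagonal differences all vanish, and the three values along a half-diagonal play the role of a, b, c with sum 1.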


Fig. 3


(K) Since a “point-rectangle function” may be made continuous while re- 
taining the same values for max F(f) and max V(f), we may clearly construct 
example (K) by modifying example (E) in this way, and so obtain a continu- 
ous function which is in class F but not V. 

Proof of (15c). Example (B) ; example (H). 

Proof of (16c). Example (B); relation (15c). 

Proof of (17c). Example (B); relations (5) and (15c). 

Proof of (23c). Example (K) is readily seen to be in class A·F·C, but not in V and hence not in H. Since we have H·C ≤ A·F·C, (23c) follows.

Proof of (24c). Relation (23c). 

Proof of (25c). Relation (23c). 

Proof of (33c). Example (K); example (B). 

Proof of (34c). Relation (33c); example (B). 

Proof of (35c). Relation (33c); example (B). 

8. Open questions. The following is a complete list of pairs of classes the 



relations between which are not yet fully determined; in each case we give 
in parentheses a partial determination of the relation, with a reason therefor. 
(45) P,F-T (P £F-T by (13)), 

(46) A, P-F (A = P-F by (15)), 

(47) A-F, P-F (A-F < P-F by (1)), 

(48) A-T, P-F (A-T £ P-F by (31)), 

(49) P-F,F-T (P-F S F-T by (27)), 

(50) P-T,F-T (P-TEF-T* (32)), 

(36c) AC (A-C £ F-T-C by (15c)), 

(41c) A-F-C,F-T-C (A-F-C F-T-C by (1c)), 

(45c) P-C,F-T-C (P-C £F-T-C by (16c)), 

(46c) A-C, P-F-C (A-C £ P-F-C by (15c)), 

(47c) A-F-C, P-F-C (A-F-C S P-F-C by (1)), 

(49c) P-F-C,F-T-C (P-F-C < F-T-C by (1c)). 


The relations still to be determined present some interesting, but proba- 
bly not simple, problems. We would hazard no conjecture concerning their 
nature except in the case of (36c) and (41c), the first of which is probably an 
overlapping relation and the second a definite inequality; that such is the 


case could be established by modifying example (G) so as to be continuous 
while preserving its other properties. We have no doubt that this modifica- 
tion is possible, but see no way to do it easily. 


BROWN UNIVERSITY, 
PROVIDENCE, R.I. 


THE GENERAL WEB OF ALGEBRAIC SURFACES OF 
ORDER n AND THE INVOLUTION DEFINED BY IT* 


BY 
TEMPLE R. HOLLCROFT 


1. Introduction. Webs of quadric surfaces have been studied quite extensively†, but, except for finding the characteristics of its jacobian‡, the general web has not been treated for n > 2.

The surfaces of a web are in (1, 1) correspondence with the planes of three-space. In the case of a general web of surfaces of order n, this correspondence establishes a space involution of order $n^3$. For n = 2, this involution has been treated by Snyder and Sharpe.§

In the present paper, the properties of both the general web of algebraic surfaces of order n and the space involution associated with such a web are obtained.

2. The web. The equation of a web of algebraic surfaces is

\[ \sum \lambda_i f_i = 0 \qquad (i = 1, 2, 3, 4), \]

in which the $\lambda_i$ are homogeneous parameters and the $f_i$ are homogeneous, algebraic functions of order n in the variables $x_1, x_2, x_3, x_4$. The general web treated in this paper is such that the coefficients in each of the four $f_i$ defining the web are unrestricted, that is, the $f_i$ represent non-singular surfaces and the web has no basis points or curves.

The jacobian J of a web of surfaces|| is the locus of nodes¶ and also of contacts of surfaces of the web. It is a surface of order $4(n - 1)$. For a general web, J has no singularities and is, therefore, of genus

\[ p = \tfrac{1}{3}(2n - 3)(4n - 5)(4n - 7). \]

The characteristics of J result immediately from the Cayley formulas for the characteristics of a non-singular surface.**


* Presented to the Society, December 29, 1932; received by the editors February 7, 1933. 

† Pascal, Repertorium der höheren Mathematik, vol. II, (1922), pp. 629-631. Encyklopädie der Mathematischen Wissenschaften, vol. III2, pp. 250-254.

‡ Pascal, loc. cit., p. 680.

§ Virgil Snyder and F. R. Sharpe, Space involutions defined by a web of quadrics, these Transactions, vol. 19 (1918), pp. 275-290.

|| As only algebraic curves and surfaces are treated in this paper, the adjective "algebraic" will be omitted from this point on.

¶ The term "node" is used to mean an entirely general conic node.

** Pascal, loc. cit., pp. 696-697.


856 T. R. HOLLCROFT [October 


Since a web involves three essential parameters, the surfaces of the web 
may have contacts or singularities associated with one, two or three in- 
variants. The following singularities on one surface and contacts of surfaces 
are associated with one, two or three invariants: 

I. One invariant. (a) One node or one contact. 

II. Two invariants. (a) Two nodes or two contacts. (b) One binode or 
one stationary contact. 

III. Three invariants. (a) Three nodes or three contacts. (b) One binode 
and one node or one stationary and one simple contact. (c) One special 
binode B, whose axis has four-point contact with the surface or one contact 
such that the curve of intersection has a tacnode at the point of contact. 

One invariant defines a doubly infinite system of surfaces of the web, the 
locus of whose singularities or contacts associated with the given invariant 
is the jacobian surface J. Thus J is the locus of nodes and contacts belonging 
to both doubly infinite systems of surfaces of the web. 

Two invariants define a singly infinite system of surfaces of the web, the 
loci of whose singularities or contacts associated with the two given invariants 
are curves on J. The characteristics of these four curves are obtained in §§6 
and 7. 

Three invariants define a finite system of surfaces of the web whose singu- 
larities or contacts associated with the three given invariants lie at certain 
intersections or contacts of the above curves on J. The positions and num- 
bers of these for each of the six finite systems are also found in §§6 and 7. 

3. The involution defined by the web.* The (1, 1) correspondence existing between the surfaces of the web and the planes of the three-space (y) is defined by the equations

\[ \rho y_i = f_i \qquad (i = 1, 2, 3, 4). \]

To a line of (y) corresponds a space curve of order $n^2$ and genus $(n - 1)(n^2 - n - 1)$, the basis curve of a pencil of surfaces of the web. The surfaces of this pencil are the images of the planes of (y) belonging to a pencil whose axis is the given line.

Conversely, the complete image of a surface of the web or of the basis curve of a pencil of surfaces is the corresponding plane or line of (y) respectively counted $n^3$ times.
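The order and genus of the basis curve of a pencil agree with the usual formula for a non-singular complete intersection of two surfaces in three-space, $g = 1 + \tfrac12 ab(a + b - 4)$ for surfaces of orders a and b. A minimal sketch comparing the two expressions; the function names are hypothetical:

```python
def complete_intersection_genus(a, b):
    """Genus of a smooth complete intersection of surfaces of orders a and b in 3-space."""
    # a * b * (a + b - 4) is always even, so integer division is exact.
    return 1 + a * b * (a + b - 4) // 2


def pencil_base_curve(n):
    """Order and genus of the basis curve of a pencil of surfaces of order n."""
    return n * n, (n - 1) * (n * n - n - 1)
```

For n = 2 this gives the quartic curve of genus 1 in which two quadrics meet.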

To a bundle of planes through a point P of (y) corresponds a net of surfaces of the web with $n^3$ basis points. These $n^3$ basis points are all images of P. Conversely, to each of these $n^3$ points of (x) corresponds the given point P


* The results of this section are chiefly generalizations of those obtained by Snyder and Sharpe 
(loc. cit.) for n=2. 


1933] GENERAL WEBS OF ALGEBRAIC SURFACES 857 


of (y). The unique correspondence of planes of (y) to surfaces of the web therefore establishes an involution of order $n^3$ between the spaces (x) and (y).

The locus of the points of (y), two of whose $n^3$ image points coincide, is a surface L called the branch-point surface of the transformation. The corresponding locus of coincidences is J. L and J are in (1, 1) correspondence. The order of L is the order of the tact-invariant of two surfaces of the web, which is $4n^2(n - 1)$.

The complete image of L is J counted twice and a residual surface R of order $4(n - 1)(n^3 - 2)$. To a point of L correspond two coincident points of J and $n^3 - 2$ distinct points of R.

The complete image of J is L. The complete image of R is L counted $n^3 - 2$ times.

The image of a plane $\pi$ of (x) is a rational surface s of order $n^2$ in (y), whose only singularity is a nodal curve. The image of s is the plane $\pi$ and a residual surface $s_1$ of order $n^3 - 1$. The plane $\pi$ meets J in a curve of order $4(n - 1)$ through which $s_1$ passes. The residual intersection of $\pi$ and $s_1$ is a plane curve of order $(n - 1)(n^2 + n - 3)$, the image of the nodal curve of s. The nodal curve of s is, therefore, of order $\tfrac12 n(n - 1)(n^2 + n - 3)$ and since s
is rational, the rank of its nodal curve is (n —1)(n?—3)(2n*+2n?—7n+2)/3. 

The surface s intersects L in a curve of order $4n^4(n - 1)$, consisting of a contact curve of order $4n(n - 1)$, the image of the intersection curve of $\pi$ and J; and an intersection curve of order $4n(n - 1)(n^3 - 2)$, the image of the plane section of R by $\pi$.
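The orders quoted in this section can be cross-checked by degree counting. A minimal sketch, assuming that a surface of order N in (y) has a complete image of order nN in (x) (the $y_i$ being forms of order n in x), and reading the orders as $4n^2(n-1)$ for L, $4(n-1)(n^3-2)$ for R, $n^2$ for s, $n^3 - 1$ for $s_1$ and $4n^4(n-1)$ for the curve in which s meets L. The function names are hypothetical:

```python
def image_of_L_balances(n):
    """n * order(L) should equal 2 * order(J) + order(R)."""
    order_J = 4 * (n - 1)
    order_L = 4 * n**2 * (n - 1)
    order_R = 4 * (n - 1) * (n**3 - 2)
    return n * order_L == 2 * order_J + order_R


def residual_plane_curve_order(n):
    """Order left in the plane after removing its section by J from its section by s_1."""
    return (n**3 - 1) - 4 * (n - 1)


def s_meets_L_balances(n):
    """order(s) * order(L) should equal twice the contact curve plus the residual curve."""
    total = n**2 * (4 * n**2 * (n - 1))
    contact = 4 * n * (n - 1)
    residual = 4 * n * (n - 1) * (n**3 - 2)
    return total == 4 * n**4 * (n - 1) == 2 * contact + residual
```

The contact curve is counted twice because s touches L along it.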

The images of linear systems of planes of (x) do not form linear systems of surfaces in (y). The image of a pencil of planes of (x) is a singly infinite, non-linear system of surfaces in (y) of order $n^2$. This system of surfaces has in common a curve of order n, the image of the line which is the axis of the pencil of planes of (x). This curve $C_n$ is rational and has $\tfrac12(n - 1)(n - 2)$
apparent double points. Any two image surfaces of planes of the pencil inter- 
sect in C, and a residual curve C’ of order m(m*—1) and genus m(3n*—5n? 
—4n+7) with }(n—1)[n(n—1)(n+1)2(n*—2)—n—2] apparent double 
points. C’ also has n(m—1)(n?—1)(n?+n—3) nodes at its intersections with 
the two nodal curves of the two surfaces. 

4. Nets contained in the web. The points of (y), each considered as 
bearing a bundle of planes, determine a triple infinity of nets of surfaces of 
the web. 

Consider the net F of surfaces corresponding to the bundle of planes through an arbitrary point P of (y). The properties of F are uniquely associated with certain characteristics of the branch-point curve $L_1$ of a plane $\pi_1$, whose lines are in (1, 1) correspondence with the surfaces of F. The following characteristics of $L_1$ (which will be used in the next section to obtain the characteristics of the branch-point surface L of the web) result on setting i = 3 in the formulas for a net of hypersurfaces in i dimensions*:

\begin{align*}
n_1 &= 6n(n - 1)^2; \\
m_1 &= 4(n - 1)^3; \\
\delta_1 &= (n - 1)^2[18n^2(n - 1)^2 - 59n + 74]; \\
\kappa_1 &= 12(n - 1)^2(3n - 4); \\
\tau_1 &= 2(n - 1)^2(n - 2)(4n^3 - 8n^2 + 8n - 25); \\
\iota_1 &= 30(n - 1)^2(n - 2).
\end{align*}
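These six characteristics must satisfy the Plücker relations for a plane curve, which gives a quick consistency check. A minimal sketch, reading the six values as order, class, nodes, cusps, bitangents and stationary tangents of the branch-point curve; the function and key names are hypothetical:

```python
def branch_curve_characteristics(n):
    """Characteristics of the branch-point curve of a net of surfaces of order n."""
    k = (n - 1) ** 2
    return {
        "order": 6 * n * k,
        "class": 4 * (n - 1) ** 3,
        "nodes": k * (18 * n**2 * k - 59 * n + 74),
        "cusps": 12 * k * (3 * n - 4),
        "bitangents": 2 * k * (n - 2) * (4 * n**3 - 8 * n**2 + 8 * n - 25),
        "inflections": 30 * k * (n - 2),
    }


def satisfies_pluecker(c):
    """Check class, order and inflection formulas of Pluecker."""
    N, M = c["order"], c["class"]
    d, k, t, i = c["nodes"], c["cusps"], c["bitangents"], c["inflections"]
    return (M == N * (N - 1) - 2 * d - 3 * k
            and N == M * (M - 1) - 2 * t - 3 * i
            and i - k == 3 * (M - N))
```

For n = 2 the curve has order 12, class 4, 28 nodes and 24 cusps, with no bitangents or stationary tangents.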


The jacobian curve $J_1$ of the net is of order $6(n - 1)^2$ and is the locus of both nodes and contacts of surfaces of the net. $J_1$ is also the coincidence curve of the transformation. To nodes $\delta_1$, cusps $\kappa_1$, bitangents $\tau_1$, stationary tangents $\iota_1$ of $L_1$ correspond uniquely and respectively surfaces of F with two contacts, one stationary contact, two nodes, one binode.

The surfaces of F that have a node are surfaces of the web and therefore all these nodes lie on the surface J. The respective jacobian curves of the $\infty^3$ nets contained in the web form a triply infinite linear system of curves on J, that is, the jacobians of the nets of the web build a web of curves on the jacobian of the web.

Two points P and P' of (y) determine the line PP' carrying an axial pencil whose planes are common to the two bundles on P and P'. Corresponding to the two bundles and to their common axial pencil, there are in (x) two sets of surfaces with a pencil of surfaces in common. A pencil of surfaces contains $4(n - 1)^3$ surfaces with a node. These surfaces belong to both nets and therefore their $4(n - 1)^3$ nodes lie on both jacobian curves. Therefore, in the web of jacobian curves on J, any two jacobian curves intersect in $4(n - 1)^3$ points.

To a bundle of lines on any point P of (y) corresponds a net of curves of order $n^2$ and genus $(n - 1)(n^2 - n - 1)$ with $n^3$ basis points. These curves are the intersections of the surfaces of the net corresponding to the bundle of planes on P. The nodes of this net of curves lie at contacts of surfaces of the associated net. Any such net of curves has, therefore, the same jacobian curve $J_1$ as the associated net of surfaces.

5. The characteristics of L. The characteristics of the branch-point sur- 
face L will be represented by the following symbols: 

N [n’] order [class]; 

a [a'] order of tangent cone [class of plane section];


* Hollcroft, Nets of manifolds in i dimensions, Annali di Matematica, (4), vol. 5 (1927-28), p. 265. 




κ' [δ'] number of inflections [bitangents] of plane section;

κ [δ] number of cuspidal [nodal] lines of tangent cone;

b [c] order of nodal [cuspidal] curve;

b' [c'] class of bitangential [spinodal] developable;

q [q'] class [order] of nodal curve [bitangential developable];

r [r'] class [order] of cuspidal curve [spinodal developable];

β [γ] (i) number of intersections of nodal and cuspidal curves which are cusps on cuspidal [nodal] (neither) curve;

β' [γ'] (i') number of common planes of bitangential and spinodal developables which are stationary on the spinodal [bitangential] (neither) developable;

t [t'] number of triple points [planes] of nodal curve [bitangential developable];

p’ [o’ |(¢’) order of bitangential [spinodal] (flecnodal) curve; 

ρ [σ] class of nodal [cuspidal] developable.

The order of L was found in §3. The class of L is the order of the discriminant
of a pencil of surfaces of the web, which is $4(n-1)^3$.

The characteristics of the tangent cone to L from any point will now be
obtained.

To a bundle of planes of (y) on an arbitrary point P corresponds a net F of
surfaces of the web. Of the $\infty^2$ planes through P, a single infinity are tangent
to L, enveloping the tangent cone to L from P. The planes through P tangent
to L correspond uniquely to the single infinity of surfaces of the net that
have a node. The planes enveloping the tangent cone to L from P are, therefore,
in (1, 1) correspondence with the points of the jacobian curve $J_1$ of F.

The plane section $L_1$ of this tangent cone made by any plane $\pi_1$ not
through P is enveloped by lines which are sections by $\pi_1$ of the enveloping
tangent planes of the cone. The lines enveloping $L_1$ are thus in (1, 1)
correspondence with the points of $J_1$. Therefore, since the lines of $\pi_1$ (sections
by $\pi_1$ of the planes of the bundle) are in (1, 1) correspondence with the
surfaces of F, the plane section $L_1$ of the tangent cone is the branch-point curve
in the transformation of the lines of $\pi_1$ into the surfaces of F. From the
preceding section, the order of $L_1$ is $6n(n-1)^2$ which is, therefore, the order
a of the tangent cone to L from any point.

The number of bitangent [stationary] planes from P to L is the class b'
[c'] of the bitangential [spinodal] developable of L. These planes are cut by
$\pi_1$ in bitangents [stationary tangents] to the plane curve $L_1$. The numbers of
bitangents $\tau_1$ and stationary tangents $\iota_1$ of $L_1$ were obtained in the preceding
section. These values of $\tau_1$ and $\iota_1$ are, therefore, the values of b' and c'
respectively.




The number of nodes [cusps] of the plane section $L_1$ is the number of
nodal [cuspidal] generators $\delta$ [$\kappa$] of the tangent cone. The numbers of nodes
$\delta_1$ and cusps $\kappa_1$ of $L_1$ were obtained in the preceding section. These are the
values of $\delta$ and $\kappa$ respectively.

Since, for any algebraic surface, a = a', the class of a plane section of L is
$6n(n-1)^2$.

To find the genus of a plane section of L, consider the plane section L'
of L made by any plane $\pi'$. L', of order $4n^2(n-1)$, has for its image in (x)
the space curve J' which is the complete intersection of J and the surface
of the web which is the image of $\pi'$. This curve J' is of order $4n(n-1)$ and
has $2n(n-1)^2(4n-5)$ apparent double points. Its genus p' is, therefore,
$p' = 2n(n-1)(5n-8)+1$, which is also the genus of the plane section L' of L
since L' and J' are in (1, 1) correspondence.
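The order, number of apparent double points and genus quoted for J' agree with the classical formulas for the complete intersection of two surfaces. The following is a minimal sketch of that check, assuming J has order 4(n-1) (as implied by the stated order of J') and that J' has no actual singular points; the function name is illustrative and not taken from the paper.

```python
def complete_intersection_curve(mu, nu):
    """Order, apparent double points and genus of the complete
    intersection of two surfaces of orders mu and nu (classical formulas
    for a curve without actual double points)."""
    order = mu * nu
    apparent_double_points = mu * nu * (mu - 1) * (nu - 1) // 2
    genus = (order - 1) * (order - 2) // 2 - apparent_double_points
    return order, apparent_double_points, genus


# Compare with the closed forms stated in the text for J'.
for n in range(2, 8):
    order, h, genus = complete_intersection_curve(4 * (n - 1), n)
    assert order == 4 * n * (n - 1)
    assert h == 2 * n * (n - 1) ** 2 * (4 * n - 5)
    assert genus == 2 * n * (n - 1) * (5 * n - 8) + 1
```

For a web of quadrics (n = 2) this gives a curve of order 8 with 12 apparent double points and genus 9.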

Knowing the order, class and genus of a plane section of L, the number
of its nodes, cusps, bitangents $\delta'$, inflections $\kappa'$ can be found by means of
Plücker's equations. The numbers of its nodes and cusps are respectively
the orders b and c of the nodal and cuspidal curves of L.
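The step from order, class and genus to the four Plücker numbers is a pair of small linear systems. The sketch below is one possible way to carry it out; the function names are hypothetical, and the order, class and genus of the plane section are taken from the formulas stated in this section.

```python
def plucker_numbers(order, cls, genus):
    """Solve Pluecker's equations for a plane curve with only nodes and cusps.

    class = order*(order-1) - 2*nodes - 3*cusps
    genus = (order-1)*(order-2)/2 - nodes - cusps
    and the dual pair of equations for bitangents and inflections.
    """
    s = (order - 1) * (order - 2) // 2 - genus      # nodes + cusps
    u = order * (order - 1) - cls                   # 2*nodes + 3*cusps
    nodes, cusps = 3 * s - u, u - 2 * s
    s_dual = (cls - 1) * (cls - 2) // 2 - genus     # bitangents + inflections
    u_dual = cls * (cls - 1) - order                # 2*bitangents + 3*inflections
    bitangents, inflections = 3 * s_dual - u_dual, u_dual - 2 * s_dual
    return nodes, cusps, bitangents, inflections


def plane_section_of_L(n):
    """Pluecker numbers (b, c, delta', kappa') of a plane section of L."""
    order = 4 * n**2 * (n - 1)                 # N
    cls = 6 * n * (n - 1) ** 2                 # a'
    genus = 2 * n * (n - 1) * (5 * n - 8) + 1  # p'
    return plucker_numbers(order, cls, genus)


print(plane_section_of_L(2))
```

For n = 2 the plane section has order 16, class 12 and genus 9, and the sketch returns 60 nodes, 36 cusps, 22 bitangents and 24 inflections.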

The genus D of L is the same as that found for J in §3, since L and J are
in (1, 1) correspondence. L has no singularities other than a nodal and a
cuspidal curve.

In the above, the values of the following characteristics of L have been
obtained: $N$, $n'$, $a$, $a'$, $b$, $b'$, $c$, $c'$, $\delta$, $\delta'$, $\kappa$, $\kappa'$, $D$. From these, by use of the
Cayley-Zeuthen equations,* the values of $\rho$, $\rho'$ result immediately
and the remaining characteristics are found by the solution of sets of linear
equations.
The characteristics of L obtained above are as follows: 


$N = 4n^2(n-1)$;
$n' = 4(n-1)^3$;
$a' = a = 6n(n-1)^2$;
$b = 2n(n-1)(4n^4 - 4n^3 - 19n + 21)$;
$b' = 2(n-1)^2(n-2)(4n^3 - 8n^2 + 8n - 25)$;
$c = 2n(n-1)(11n - 13)$;
$c' = 30(n-1)^2(n-2)$;
$\delta = (n-1)^2[18n^2(n-1)^2 - 59n + 74]$;
$\delta' = n(n-1)[18n(n-1)^3 - 47n + 69]$;
$\kappa = 12(n-1)^2(3n - 4)$;


* Salmon, Geometry of Three Dimensions, 4th edition, 1882, pp. 596-600; Pascal, loc.
cit., pp. 692-693.



$\kappa' = 4n(n-1)(7n - 11)$;
$\sigma = 2(n-1)^2(19n - 22)$;
$\sigma' = 10n(n-1)(3n - 5)$;
$\rho = 4(n-1)^2(6n^4 - 6n^3 - 31n + 34)$;
$\rho' = 4n(n-1)[6(n-1)^4 - 25n + 39]$;
$\varphi' = 2n(n-1)(73n - 111)$;
$r = 4(n-1)(26n^2 - 62n + 37)$;
$r' = 20(n-1)(6n^2 - 18n + 13)$;
$\beta = 8(n-1)(3n - 4)(6n - 7)$;
$\beta' = 40(n-1)(2n - 3)(3n - 7)$;
$q = 4(n-1)[6n^3(n-1)^2 - 49n^2 + 110n - 62]$;
$q' = 4(n-1)[6n(n-1)^4 - 55n^2 + 154n - 105]$;
$\gamma = 8(n-1)(11n^5 - 24n^4 + 13n^3 - 87n^2 + 207n - 123)$;
$\gamma' = 120(n-1)(n^5 - 6n^4 + 14n^3 - 25n^2 + 42n - 31)$;
$t = \tfrac{8}{3}(n-1)[4n^6(n-1)^2 - 57n^5 + 120n^4 - 63n^3 + 250n^2 - 574n + 330]$;
$t' = \tfrac{8}{3}(n-1)[4(n-1)^8 - 75n^5 + 438n^4 - 1002n^3 + 1498n^2 - 1937n + 1256]$;
$i = i' = 0$;
$D = \tfrac{1}{3}(2n - 3)(4n - 5)(4n - 7)$.
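The unaccented formulas above can be cross-checked against three of the Cayley-Zeuthen relations. The sketch below assumes those relations in the form $a(N-2) = \kappa + \rho + 2\sigma$, $b(N-2) = \rho + 2\beta + 3\gamma + 3t$, $c(N-2) = 2\sigma + 4\beta + \gamma$, that is, with every singularity that L does not possess set to zero; this reduced form and the function names are assumptions of the sketch, not statements of the paper.

```python
def characteristics(n):
    """Selected unaccented characteristics of L for a web of order n."""
    m = n - 1
    return {
        "N": 4 * n**2 * m,
        "a": 6 * n * m**2,
        "b": 2 * n * m * (4 * n**4 - 4 * n**3 - 19 * n + 21),
        "c": 2 * n * m * (11 * n - 13),
        "kappa": 12 * m**2 * (3 * n - 4),
        "sigma": 2 * m**2 * (19 * n - 22),
        "rho": 4 * m**2 * (6 * n**4 - 6 * n**3 - 31 * n + 34),
        "beta": 8 * m * (3 * n - 4) * (6 * n - 7),
        "gamma": 8 * m * (11 * n**5 - 24 * n**4 + 13 * n**3
                          - 87 * n**2 + 207 * n - 123),
        # 3t is stored instead of t so that everything stays an integer.
        "three_t": 8 * m * (4 * n**6 * m**2 - 57 * n**5 + 120 * n**4
                            - 63 * n**3 + 250 * n**2 - 574 * n + 330),
    }


def cayley_zeuthen_residuals(n):
    """Left side minus right side of the three relations; all should be 0."""
    ch = characteristics(n)
    N = ch["N"]
    return (
        ch["a"] * (N - 2) - (ch["kappa"] + ch["rho"] + 2 * ch["sigma"]),
        ch["b"] * (N - 2)
        - (ch["rho"] + 2 * ch["beta"] + 3 * ch["gamma"] + ch["three_t"]),
        ch["c"] * (N - 2) - (2 * ch["sigma"] + 4 * ch["beta"] + ch["gamma"]),
    )


for n in range(2, 8):
    assert cayley_zeuthen_residuals(n) == (0, 0, 0)
```

The second relation is sensitive to the coefficient of n in the formula for t: it balances with 574n, which is why that value is used here.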


6. Loci of contacts and coincidences. If the vertex P of a bundle of planes
of (y) is on L, at the image point $P_1$ on J the surfaces of the corresponding
net have a common tangent line $l$ which lies in the tangent plane to J at $P_1$.
This line $l$ is tangent at $P_1$ to all the image curves of all the lines through P.

Of this net, the surfaces of a pencil corresponding to the planes of a pencil
whose axis is tangent to L at P have contact at $P_1$. Since in the tangent plane
to L at P, all the lines of the flat pencil with vertex at P are tangent to L at
P, and since the axial pencils on the lines of this flat pencil include all the
planes of the bundle on P, the net of surfaces which have contact with a line
at $P_1$ is composed of a simply infinite linear system of pencils of surfaces such
that the surfaces of each pencil have contact with each other at $P_1$, but do
not have contact (except along the line) with any surface of another pencil.
Of these surfaces, the one surface common to all the pencils, viz., the surface
corresponding to the tangent plane to L at P, has a node at $P_1$.

The images on J of the cuspidal curve c and the nodal curve b of L are
curves of orders $c_1$ and $b_1$ respectively.




If P lies on the cuspidal curve c of L, the surfaces of the corresponding net
all osculate the curve $c_1$ at the image point $P_1$. Of this net, the surfaces of the
pencil corresponding to the planes of the pencil whose axis is the tangent to
c at P, all have stationary contact with each other at $P_1$.

If the surfaces of a net have a common osculating curve at a point $P_1$,
three of the $n^3$ basis points of the net occur at $P_1$. Then the cuspidal curve of
L is the locus of points P such that at each image point $P_1$ on J occurs

(1) the coincidence of three images of P;

(2) one stationary contact of the surfaces of a pencil of the web.

If P lies on the nodal curve b of L, the surfaces of the corresponding net
all have two distinct contacts with the image curve $b_1$. One of these contacts
is at $P_1$ on J, the image of P considered on one sheet of L through b; the other
is at $P_2$ on J, the image of P considered on the other sheet of L through b. Of
this net, the surfaces of the pencil corresponding to the planes through the
tangent line to b at P all have contact with each other at $P_1$ and at $P_2$. Of
this pencil, one surface corresponding to the tangent plane at P to one sheet
of L has a node at $P_1$ and another surface corresponding to the tangent plane
at P to the other sheet of L has a node at $P_2$.

If all the surfaces of a net have two distinct contacts with a curve at $P_1$
and $P_2$, two of the $n^3$ basis points coincide at $P_1$ and also two at $P_2$. Then the
nodal curve of L is the locus of points P such that at each of the two distinct
image points $P_1$ and $P_2$ of J occur

(1) two coincident images of P;
(2) one contact of the surfaces of a pencil of the web.


Since the points of J at which occur more than a simple coincidence, or
at which occur a combination of simple coincidences, must lie also on the
residual surface R, they must lie on the curves common to these two surfaces.
The images of the nodal and cuspidal curves of L must, therefore, be the
curves in which J and R intersect. The surfaces J and R intersect in a
composite curve of order $16(n-1)^2(n^3-2)$.

The complete image of the nodal curve b of L is the intersection curve $b_1$
of J and R of order

$b_1 = 4(n-1)(4n^4 - 4n^3 - 19n + 21)$

counted twice and a residual curve $b_2$ on R of order

$b_2 = 2(n-1)(n^3 - 4)(4n^4 - 4n^3 - 19n + 21)$,

which is the nodal curve of R. The complete images of $b_1$ and $b_2$ are b counted
twice and $n^3-4$ times respectively. The curves b and $b_1$ are in (1, 2) point
correspondence.




The complete image of the cuspidal curve c of L is the contact curve $c_1$
of J and R of order

$c_1 = 2(n-1)(11n - 13)$

counted three times and a residual curve $c_2$ on R of order

$c_2 = 2(n-1)(n^3 - 3)(11n - 13)$,

which is the cuspidal curve of R. The complete images of $c_1$ and $c_2$ are c
counted three times and $n^3-3$ times respectively. The curves $c_1$ and c are
in (1, 1) correspondence.
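The orders of the image curves can be checked for internal consistency. The sketch below assumes that the contact curve $c_1$ counts twice and the intersection curve $b_1$ once in the complete intersection of J and R, and that the image of a plane of (y) is a surface of order n of the web; under these assumptions the orders should satisfy the relations asserted in the loop. The function name is illustrative.

```python
def image_curve_orders(n):
    """Orders of b, c on L and of their image curves b1, b2, c1, c2."""
    m = n - 1
    B = 4 * n**4 - 4 * n**3 - 19 * n + 21
    C = 11 * n - 13
    return {
        "b": 2 * n * m * B,
        "c": 2 * n * m * C,
        "b1": 4 * m * B,
        "b2": 2 * m * (n**3 - 4) * B,
        "c1": 2 * m * C,
        "c2": 2 * m * (n**3 - 3) * C,
    }


for n in range(2, 8):
    o = image_curve_orders(n)
    # b1 (simple) plus c1 (contact, counted twice) exhausts the J-R intersection.
    assert o["b1"] + 2 * o["c1"] == 16 * (n - 1) ** 2 * (n**3 - 2)
    # A plane meets b (or c) in as many points as its image meets the image curve,
    # weighted by the number of images lying on that curve.
    assert n * o["b1"] == 2 * o["b"]
    assert n * o["b2"] == (n**3 - 4) * o["b"]
    assert n * o["c1"] == o["c"]
    assert n * o["c2"] == (n**3 - 3) * o["c"]
```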

From the above discussion of images of points of c, it results that the
contact curve $c_1$ of J and R is the locus both of the three-point coincidences
of the involution and of the stationary contacts of pencils of surfaces of the
web. The $n^3-3$ residual images of each point of c lie on $c_2$.

Similarly, from the above discussion of images of points of b, it results
that the intersection curve $b_1$ of J and R is the locus both of the pairs of
simple coincidences of the involution and of the contacts at two distinct
points of pencils of surfaces of the web. The $n^3-4$ residual images of each
point of b lie on $b_2$.

At a point P on L, one of the $\gamma$ intersections of the curves b and c which
is a cusp on b, the two sheets of L through c are intersected by another sheet
of L. The bundle of planes through P corresponds to a net of surfaces all of
which (1) osculate the curve $c_1$ at $P_1$, the image of P considered as on the two
sheets of L through c that have a common tangent plane, and (2) have contact
with $b_1$ at $P_2$, the image of P considered as on the other sheet of L. Of
the $n^3$ images of such a point P, three coincide at $P_1$ and two at $P_2$. The planes
of the pencil whose axis is the cuspidal tangent to b at P correspond to
surfaces of a pencil of the web which have stationary contact at $P_1$ and simple
contact at $P_2$.

Therefore, the number of pencils of surfaces of the web that have two
distinct contacts, one of which is stationary, and also the number of pairs of
coincidences, each pair consisting of one three-point and one two-point
coincidence, is

$\gamma = 8(n-1)(11n^5 - 24n^4 + 13n^3 - 87n^2 + 207n - 123)$.


The points $P_1$ lie at intersections of the curves $b_1$ and $c_1$ and the points
$P_2$ lie on $b_1$. The curve $c_2$ passes through the points $P_2$ and $b_2$ passes through
both the points $P_1$ and $P_2$. Then $c_2$ intersects J in $\gamma$ points at the points $P_2$,
and $b_2$ intersects J in $2\gamma$ points at the points $P_1$ and $P_2$.

The $n^3-5$ residual images of each of the $\gamma$ points of L lie at $n^3-5$
intersections $\gamma_2$ of $b_2$ and $c_2$ which are cusps on $b_2$. The number $\gamma_2 = (n^3-5)\gamma$.




Consider a point P on L at one of the $\beta$ intersections of b and c which are
cusps on c. At such a point, there is but one sheet of L. A plane through P
intersects L in a curve with a triple point at P whose penultimate form
consists of one node and two cusps. The bundle of planes on P corresponds to a
net of surfaces all of which have four-point contact with $c_1$ at the image
point $P_1$ on J. The pencil of planes whose axis is the cuspidal tangent to c
at P corresponds to the pencil of surfaces which have tacnodal contact at $P_1$.

The number of pencils of surfaces of the web that have tacnodal contact
and also the number of four-point coincidences of the involution is, therefore,

$\beta = 8(n-1)(3n - 4)(6n - 7)$.


A four-point coincidence may be considered as (1) the union of two simple
coincidences or (2) the addition of one image point to a triple coincidence.
In case (1), the surfaces of a net which were all tangent to $b_1$ at two distinct
points would all osculate $b_1$ at a single point. In case (2), the surfaces all of
which osculated $c_1$ at one point would have four-point contact with $c_1$ at that
point. Since both these possibilities must be accounted for, at the points $P_1$,
$b_1$ osculates $c_1$. Since the three image points in the component triple
coincidence of case (2) are symmetrical, the curve $c_2$ also osculates $c_1$ at the points
$P_1$, and therefore $c_2$ has $3\beta$ intersections with J at these points. The curve $b_2$
does not pass through the points $P_1$.

The $n^3-4$ residual images of each of the $\beta$ points of L lie at $n^3-4$
intersections $\beta_2$ of $b_2$ and $c_2$ which are cusps on $c_2$. The number $\beta_2 = (n^3-4)\beta$.

A point P on L at a triple point of b corresponds to three distinct points
$P_1$, $P_1'$, $P_1''$ on $b_1$. To a bundle of planes through P corresponds a net of
surfaces all of which have three distinct contacts with $b_1$ at $P_1$, $P_1'$, $P_1''$. Of the
$n^3$ basis points of the net, two coincide at each of these three points, that is,
such a point P has three pairs of coincident image points.

A triple point P of b is also a triple point of L. Then at P there exists a
single infinity of tangent planes to L enveloping the cubic tangent cone. At
the three image points $P_1$, $P_1'$, $P_1''$ of P, occur, therefore, nodes belonging to
three singly infinite sets of distinct surfaces of the web corresponding to the
tangent planes to the three respective sheets of L through P.

To the three tangent lines to b at P correspond three pencils of surfaces
such that the surfaces of the pencils have two contacts at $P_1$, $P_1'$; $P_1$, $P_1''$;
$P_1'$, $P_1''$ respectively. One plane is common to each of the three pairs of axial
pencils. These three planes correspond to three surfaces that have contact
at all three image points.

Therefore the number of sets of surfaces, each set containing three
surfaces that have three distinct contacts with each other, and also the number
of sets of three distinct simple coincidences in the involution, is the number of
triple points of b,

$t = \tfrac{8}{3}(n-1)[4n^6(n-1)^2 - 57n^5 + 120n^4 - 63n^3 + 250n^2 - 574n + 330]$.


The curve $b_2$ intersects $b_1$ at each of the three coincidences. Then at these
points, $b_2$ intersects J in $3t$ points.

The residual images of the t points of L lie at $t_2 = (n^3-6)t$ triple points
of $b_2$.

The tangent planes at the points $\beta$ and $\gamma$ of L are non-singular. To a
tangent plane at a point $\beta$ corresponds a surface with a node at the image
point $P_1$. To the two tangent planes at a point $\gamma$ correspond two distinct
surfaces, one with a node at $P_1$ and the other with a node at $P_2$.

The curve $c_2$ intersects J in

$3\beta + \gamma = 8(n-1)^2(n^3 - 3)(11n - 13)$

points, and the curve $b_2$ intersects J in

$3t + 2\gamma = 8(n-1)^2(n^3 - 4)(4n^4 - 4n^3 - 19n + 21)$

points, which occur as described above on the intersection curve $b_1$ and the
contact curve $c_1$ of J and R.
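Both counts equal the order of the residual curve multiplied by the order of J, taken here as 4(n-1). A minimal sketch of this comparison follows, using the formulas for $\beta$, $\gamma$ and t from §5 (with 3t kept as an integer); the function name is hypothetical.

```python
def intersection_counts(n):
    """Return (3*beta + gamma, ord(c2)*ord(J), 3*t + 2*gamma, ord(b2)*ord(J))."""
    m = n - 1
    beta = 8 * m * (3 * n - 4) * (6 * n - 7)
    gamma = 8 * m * (11 * n**5 - 24 * n**4 + 13 * n**3
                     - 87 * n**2 + 207 * n - 123)
    three_t = 8 * m * (4 * n**6 * m**2 - 57 * n**5 + 120 * n**4
                       - 63 * n**3 + 250 * n**2 - 574 * n + 330)
    order_J = 4 * m
    order_c2 = 2 * m * (n**3 - 3) * (11 * n - 13)
    order_b2 = 2 * m * (n**3 - 4) * (4 * n**4 - 4 * n**3 - 19 * n + 21)
    return (3 * beta + gamma, order_c2 * order_J,
            three_t + 2 * gamma, order_b2 * order_J)


for n in range(2, 8):
    lhs_c, rhs_c, lhs_b, rhs_b = intersection_counts(n)
    assert lhs_c == rhs_c
    assert lhs_b == rhs_b
```

For n = 2 the two counts are 360 and 480.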

7. Loci of singularities of surfaces of the web. As originally defined, the
locus of the nodes of the surfaces of the web is J.

To a bitangent plane of L corresponds a surface of the web with two nodes.
The two points of contact on L correspond respectively and uniquely to the
two nodes of the image surface. The locus of the contacts of bitangent planes
of L is the bitangential curve $\rho'$ of order

$\rho' = 4n(n-1)[6(n-1)^4 - 25n + 39]$.

The locus of the pairs of nodes, each pair of which belongs to one surface,
is a curve $\rho_1'$ on J, the image of $\rho'$. The complete image of $\rho'$ is the curve $\rho_1'$
of order

$\rho_1' = 4(n-1)[6(n-1)^4 - 25n + 39]$

counted twice and a residual curve $\rho_2'$ on R of order

$\rho_2' = 4(n-1)(n^3 - 2)[6(n-1)^4 - 25n + 39]$.

The curves $\rho'$ and $\rho_1'$ are in (1, 1) correspondence.
The number of intersections of $\rho_1'$ and $\rho_2'$ is

$16(n-1)^2(n^3 - 2)[6(n-1)^4 - 25n + 39]$.


To a stationary plane of L corresponds a surface of the web with a binode,
the binode being the image point on J of the point of contact on L. The locus
of the contacts of stationary tangent planes to L is the parabolic or spinodal
curve $\sigma'$ of order

$\sigma' = 10n(n-1)(3n - 5)$.

The complete image of $\sigma'$ is the curve $\sigma_1'$ on J of order

$\sigma_1' = 10(n-1)(3n - 5)$

counted twice and a residual curve $\sigma_2'$ on R of order

$\sigma_2' = 10(n-1)(n^3 - 2)(3n - 5)$.

The curve $\sigma_1'$ on J is the locus of binodes of surfaces of the web.
The number of intersections of $\sigma_1'$ and $\sigma_2'$ is

$40(n-1)^2(n^3 - 2)(3n - 5)$.

The flecnodal curve $\varphi'$ of order $2n(n-1)(73n-111)$ is the locus of
contacts of flecnodal tangent planes of L. To a flecnodal tangent plane
corresponds a surface of the web with a node, belonging to a pencil of surfaces
whose basis curve (the image of the inflectional tangent) osculates the
tangent quadric cone of the node at its vertex. To a biflecnodal tangent plane
whose contact occurs at a node of $\varphi'$, corresponds a surface with a node. This
surface is common to two pencils of surfaces both of whose basis curves
osculate the tangent quadric cone of the node at its vertex. While these
singularities account for two and three invariants respectively, actually the
only singularity on the image surface in each case is a node. The nodes of the
image surfaces of flecnodal and biflecnodal tangent planes of L all lie on a
curve $\varphi_1'$ of order $2(n-1)(73n-111)$, the image on J of $\varphi'$.

To a triple tangent plane of L corresponds a surface of the web with three
nodes. The three contacts on L correspond respectively and uniquely to the
three nodes of the image surface. The number of surfaces of the web each of
which has three nodes is the number $t'$ of triple tangent planes to L. This
number is

$t' = \tfrac{8}{3}(n-1)[4(n-1)^8 - 75n^5 + 438n^4 - 1002n^3 + 1498n^2 - 1937n + 1256]$.

These $3t'$ nodes lie at definite points, possibly nodes, of the curve $\rho_1'$ on J.

To the nodo-cuspidal planes (tangent planes with two contacts one of
which is stationary) of L correspond surfaces of the web each with one node
and one binode. The node and the binode correspond respectively and
uniquely to the simple and stationary contact. The nodo-cuspidal planes of
L are the stationary planes $\gamma'$ of the bitangential developable. Therefore the
number of surfaces of the web that have both a node and a binode is

$\gamma' = 120(n-1)(n^5 - 6n^4 + 14n^3 - 25n^2 + 42n - 31)$.




To the tacnodal tangent planes of L correspond surfaces of the web each
of which has a special binode $B_4$ whose axis has four-point contact with the
surface. The tacnodal contacts and the binodes $B_4$ are in (1, 1) correspondence.
The tacnodal tangent planes of L are the stationary planes $\beta'$ of the
spinodal developable. Therefore the number of surfaces of the web with a $B_4$ is

$\beta' = 40(n-1)(2n - 3)(3n - 7)$.

At the tacnodal points $\beta'$ of L, the three curves $\rho'$, $\sigma'$ and $\varphi'$ all have the
tacnodal tangent as tangent and therefore all three have contact with each
other at these points. The cuspidal curve c also passes through each point $\beta'$,
intersecting each of the above three curves at these points.

Since the cuspidal curve c passes through the $\beta'$ tacnodal points of L,
the $\beta'$ binodes $B_4$ lie on the contact curve $c_1$ of J and R and therefore at the
points $B_4$, three image points coincide. At these points $\rho_1'$ and $\sigma_1'$ are tangent
and also the residual curves $\rho_2'$ and $\sigma_2'$ have contact with each other and with
$\rho_1'$ and $\sigma_1'$.

The intersections of $\sigma_1'$ and $\sigma_2'$ lie on the curves $c_1$ and $b_1$ common to J
and R. The $\beta'$ contacts count as $2\beta'$ of these intersections, leaving
$40(n-1)[n^3(n-1)(3n-5) - 2(n-2)(9n-13)]$ intersections of $\sigma_1'$ and $\sigma_2'$ that lie on
$b_1$. This is likewise the number of intersections of b and $\sigma'$ on L. To a point P
of intersection of b and $\sigma'$ correspond one point $P_1$ at which occur both a
contact of surfaces of the net along $b_1$ and a binode, and another point $P_2$ at
which occurs only a contact along $b_1$. The curves $\sigma_1'$ and $\sigma_2'$ intersect only at
the points of 

Of the intersections of $\rho_1'$ and $\rho_2'$, $2\beta'$ lie on $c_1$. The remaining
$16(n-1)\{(n-1)(n^3-2)[6(n-1)^4 - 25n + 39] - 5(2n-3)(3n-7)\}$ intersections lie on
$b_1$ and correspond uniquely to the intersections of b and $\rho'$ on L. Associated
with a point P of intersection of b and $\rho'$ is a point P' on $\rho'$, but not on b,
which is the other contact of the bitangent plane $\pi$, one of whose contacts is
at P. The image of P considered on the sheet of L to which $\pi$ is tangent is a
point $P_1$ of intersection of $\rho_1'$ and $\rho_2'$ on $b_1$. At $P_1$ occurs a node of the image
surface f of $\pi$ and also a contact of f and $b_1$. At $P_2$, the image of P considered
on the other sheet of L, there is simply a contact of f (and the other surfaces of the
net) with $b_1$. Also, f has a node at the image of P'.

In addition to $\beta'$ contacts, the curves $\rho'$ and $\sigma'$ of L intersect at the $\gamma'$
cuspidal points of the nodo-cuspidal planes. The binodes of the $\gamma'$ surfaces of
the web that have both a node and a binode therefore lie at $\gamma'$ intersections
of $\rho_1'$ and $\sigma_1'$ on J, and the associated nodes lie at fixed points on $\rho_1'$.

8. The steinerian. The jacobian surface J may also be defined as the locus
of points $P_1$ whose polar planes with respect to all surfaces of the web are
concurrent at points $P_2$. The points $P_1$ are nodes of surfaces of the web. The locus
of the associated points $P_2$ is a surface, the steinerian S of the web. J and S
are thus in (1, 1) correspondence.

The branch-point surface L is the envelope of planes which are in (1, 1)
correspondence with the nodes of surfaces of the web which, in turn, are in
(1, 1) correspondence with the points of S. Then the tangent planes of L and
the points of S are in (1, 1) correspondence. We have, therefore,

The branch-point surface L of the involution defined by the web and the
steinerian S of the web are reciprocal surfaces.

All the characteristics of S may be obtained from the characteristics of
L given at the end of §5 by merely interchanging the accented and
unaccented symbols.

9. Webs of quadrics. In a web of quadrics, J and S coincide. L and J are
not reciprocal surfaces, however.

Since three conditions are sufficient for a quadric to degenerate into two
planes, the web contains a finite number of composite quadrics. This number
is ten. The axes of these composite quadrics are ten lines on the quartic J.
The images of these lines of J are ten conic tropes C' of L. L is of order 16 and
class 4. Its reciprocal is a quartic surface with ten nodes.

The simplest method, although one not heretofore used, of obtaining the
characteristics of the quadric web and its associated involution, is to
determine the characteristics of L by deriving those of its reciprocal ten-nodal
quartic. There results the following:

$N = 16$, $n' = 4$, $a = a' = 12$, $C' = 10$,
$\delta = 28$, $\delta' = 22$, $\kappa = \kappa' = 24$, $\sigma = 32$, $\rho = 80$,
$b = 60$, $q = 40$, $\gamma = 120$, $t = 80$,
$c = 36$, $r = 68$, $\beta = 80$,
padi =f =iz=i =0. 

Among the general formulas obtained for L in §5, all those defining the
values of the unaccented characteristics yield the above values for n = 2; but
the correct values (all of which are zero) for n = 2 can not be obtained from
the general formulas for certain accented characteristics, namely, $\rho'$, $\sigma'$, $r'$,
$\beta'$, $q'$, $\gamma'$, $t'$.
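The discrepancy is easy to exhibit numerically. The sketch below substitutes n = 2 into five of the accented formulas of §5; the function name is illustrative. None of the results is zero, and some are negative, so they cannot be the true values for a web of quadrics.

```python
def accented_formulas(n):
    """General §5 formulas for some accented characteristics of L."""
    m = n - 1
    return {
        "rho'": 4 * n * m * (6 * m**4 - 25 * n + 39),
        "sigma'": 10 * n * m * (3 * n - 5),
        "r'": 20 * m * (6 * n**2 - 18 * n + 13),
        "beta'": 40 * m * (2 * n - 3) * (3 * n - 7),
        "gamma'": 120 * m * (n**5 - 6 * n**4 + 14 * n**3
                             - 25 * n**2 + 42 * n - 31),
    }


for name, value in accented_formulas(2).items():
    print(name, value)
```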

The cause of this apparent discrepancy is the existence of the ten conic
tropes on L which result from the ten degenerate quadrics of the web. Three
conditions are sufficient for an algebraic surface of order n to degenerate only
when n = 2, so that for n > 2, no general web contains degenerate surfaces.
In fact, the most general web of quadrics has the same properties as the polar
web of a cubic surface, so that a general web of surfaces exists only for $n \ge 3$.
WELLS COLLEGE, 
Aurora, N. Y. 


AN ITERATIVE PROCESS IN THE 
PROBLEM OF PLATEAU* 


BY 
T. RADÓ


INTRODUCTION 


The problem of Plateau calls for a minimal surface bounded by a given
curve. In analytic formulation, the problem requires the determination of
three functions x(u, v), y(u, v), z(u, v), subject to certain boundary conditions
and such that

(I) x(u, v), y(u, v), z(u, v) are harmonic,
and
(II) E = G, F = 0,

where

$E = x_u^2 + y_u^2 + z_u^2$, $F = x_u x_v + y_u y_v + z_u z_v$, $G = x_v^2 + y_v^2 + z_v^2$.


In my previous work on this problem,† I introduced the notion of
approximate solutions of the problem by replacing the above exact condition
(II) by the approximate condition that the two integrals

$\iint (E^{1/2} - G^{1/2})^2 \, du \, dv$ and $\iint |F| \, du \, dv$

be small. The boundary conditions were also replaced by approximate
conditions. I gave a direct construction for the approximate solutions; the exact
solution was then obtained by a passage to the limit.
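To make the two integrals concrete, here is a rough numerical sketch, not part of the author's construction, that estimates them over the unit disc for a surface given as a Python function. The midpoint rule, the central differences and all names are choices made for this illustration only.

```python
import math


def conformality_defects(surface, grid=200, eps=1e-5):
    """Estimate the integrals of (sqrt(E) - sqrt(G))^2 and |F| over u^2 + v^2 < 1.

    `surface(u, v)` returns a point (x, y, z). Partial derivatives are
    approximated by central differences with step `eps`, and the integrals
    by a midpoint rule on a grid x grid subdivision of [-1, 1]^2.
    """
    h = 2.0 / grid
    defect_eg = 0.0
    defect_f = 0.0
    for i in range(grid):
        u = -1.0 + (i + 0.5) * h
        for j in range(grid):
            v = -1.0 + (j + 0.5) * h
            if u * u + v * v >= 1.0:
                continue  # cell centre lies outside the unit disc
            su = [(p - q) / (2 * eps)
                  for p, q in zip(surface(u + eps, v), surface(u - eps, v))]
            sv = [(p - q) / (2 * eps)
                  for p, q in zip(surface(u, v + eps), surface(u, v - eps))]
            E = sum(c * c for c in su)
            G = sum(c * c for c in sv)
            F = sum(c * d for c, d in zip(su, sv))
            defect_eg += (math.sqrt(E) - math.sqrt(G)) ** 2 * h * h
            defect_f += abs(F) * h * h
    return defect_eg, defect_f
```

For the flat disc x = u, y = v, z = 0 both estimates are essentially zero, since E = G = 1 and F = 0. For the stretched map x = u, y = 2v, z = 0 the first integral equals the area of the disc, so the first estimate is close to π.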

The main purpose of the present paper is to develop an iterative process
which produces automatically a sequence of approximate solutions. The
idea of the process (described and discussed in §2) is to comply with the
above conditions (I) and (II) alternately. The process starts with an
arbitrary harmonic surface bounded by the given curve and thus the result
depends, in general, upon the choice of this initial harmonic surface. In this
sense, the process contains an arbitrary parameter, namely the initial
harmonic surface $\mathfrak{S}_0$, and this fact accounts for the great flexibility of the method.

* Presented to the Society, December 29, 1932; received by the editors February 16, 1933. 


† On Plateau's problem, Annals of Mathematics, (2), vol. 31 (1930), pp. 457-469; The problem of
the least area and the problem of Plateau, Mathematische Zeitschrift, vol. 32 (1930), pp. 763-796.




To illustrate this point, let us recall that in general the solution of the
problem of Plateau is not unique.† The construction of the approximate solutions,
used in my previous work, yielded a solution of the problem whose area was
a minimum, and which consequently was not the general solution of the
problem, since a minimal surface, in general, does not have a minimum area.‡
On the other hand, it is rather obvious that the iterative process yields all
the solutions of the problem, including those with minimum area, if the
parameter $\mathfrak{S}_0$ is chosen in all possible ways. For the sake of simplicity, this
very trivial remark will be verified only for the case when the given boundary
curve is a polygon. On the other hand, we shall see that the iterative process,
if properly applied, can also be made to yield a solution with a minimum area.

After the approximate solutions have been constructed, a passage to the 
limit is necessary to obtain an exact solution. In my previous work, this 
passage to the limit has been carried out under the assumption that the given 
boundary curve bounds at least one continuous surface with a finite area. 
Replacing a lemma of Courant which I used in my proof by a lemma due to 
J. Douglas, it follows however that the restriction mentioned above can be 
dropped. Thus it follows that in order to obtain the solution for a general 
Jordan curve, it is sufficient to secure approximate solutions (which can be 
constructed directly); it is not necessary to solve the problem of Plateau first 
for some special class of curves. 

Summing up, we have the following picture. Given a general Jordan 
curve, we can always construct approximate solutions of the problem of 
Plateau, and a passage to the limit yields then an exact solution. In case 
the given curve has a finite minimum area, the approximate solutions can 
be constructed in such a way as to obtain an exact solution with a minimum 
area. The construction of the approximate solutions can be based on an auto- 
matic iterative process, properly applied. In case the given curve is a polygon, 
the iterative process can be shown to yield all the solutions of the problem. 

We conclude this introduction with a few remarks concerning the method
used in this paper. The characteristic feature of the method is the essential
use of conformal mapping. Indeed, the method is based on the following
operation. Given a surface

S: x = x(u, v), y = y(u, v), z = z(u, v), $u^2 + v^2 \le 1$;

first, change to isothermic parameters, that is to say, change to a
representation

S: x = x*(u, v), y = y*(u, v), z = z*(u, v), $u^2 + v^2 \le 1$,


† See also for literature the author's paper Contributions to the theory of minimal surfaces, Acta
Szeged, vol. 6 (1932), pp. 1-20.
‡ See Radó, Acta Szeged, loc. cit.




such that E* = G*, F* = 0. Secondly, take the harmonic functions $\bar{x}(u, v)$,
$\bar{y}(u, v)$, $\bar{z}(u, v)$ which coincide with x*(u, v), y*(u, v), z*(u, v) on $u^2 + v^2 = 1$.
Thus we derive from S a harmonic surface

$\mathfrak{S}$: $x = \bar{x}(u, v)$, $y = \bar{y}(u, v)$, $z = \bar{z}(u, v)$, $u^2 + v^2 \le 1$.


This operation, leading from S to $\mathfrak{S}$, and the simple relations between S
and $\mathfrak{S}$, were the essential tools which I used to construct the approximate
solutions in my previous work,† and these same tools will be used also in the
present paper in setting up the iterative process. In order to get from S to $\mathfrak{S}$,
we have to use a conformal map of S, and to get around the difficulties arising
in this connection, in my previous work I approximated S by polyhedrons
and referred to the theorem, proved by H. A. Schwarz, that polyhedrons do
admit of conformal maps (in a properly generalized sense). Since my first
publications on this subject, very substantial progress has been achieved in
the theory of the conformal mapping of general surfaces.‡ A theorem of
McShane, concerned with conformal maps of saddle-shaped surfaces, permits
us to deal directly with the surfaces which arise in the course of the iterative
process and to present this process in a very compact and elegant way.
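The second step of the operation, taking the harmonic functions with prescribed values on the unit circle, can be illustrated numerically with the Poisson integral. The sketch below is only an illustration under that standard formula; the function name and the sampling of the boundary are assumptions and not part of the paper. It handles one coordinate function at a time.

```python
import math


def harmonic_extension(boundary, u, v, samples=720):
    """Value at (u, v), u^2 + v^2 < 1, of the harmonic function whose
    boundary values on the unit circle are boundary(theta).

    Uses the Poisson integral, discretized by the trapezoidal rule on
    `samples` equally spaced boundary points.
    """
    r2 = u * u + v * v
    if r2 >= 1.0:
        raise ValueError("(u, v) must lie inside the unit disc")
    total = 0.0
    for k in range(samples):
        theta = 2.0 * math.pi * k / samples
        c, s = math.cos(theta), math.sin(theta)
        kernel = (1.0 - r2) / ((u - c) ** 2 + (v - s) ** 2)
        total += kernel * boundary(theta)
    return total / samples
```

With boundary values cos θ the extension is the harmonic function u itself, so `harmonic_extension(math.cos, 0.5, 0.2)` is close to 0.5. Applying the routine to each of x*, y*, z* on the circle gives points of a harmonic surface with the same boundary.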


1. PRELIMINARIES 


1.1. In this section, the definitions, lemmas and theorems referred to in
the sequel will be stated for the convenience of the reader.

A Jordan arc, in the xyz-space, is a one-to-one and continuous image of
an interval $a \le t \le b$. A Jordan curve is the one-to-one and continuous image
of the unit circle $u = \cos\theta$, $v = \sin\theta$.

If $C_1$ and $C_2$ are two Jordan arcs, then their distance $d(C_1, C_2)$, in the
Fréchet sense, is defined as follows. Denote by $\tau$ a one-to-one and continuous
correspondence between $C_1$ and $C_2$, and let $M(\tau)$ denote the maximum distance
of corresponding points. The distance $d(C_1, C_2)$ is the greatest lower bound of
$M(\tau)$, for all possible choices of $\tau$. The distance of two Jordan curves is
defined in the same manner. Convergent sequences are then defined in terms of
the distance.
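For arcs given only by finitely many sample points, a commonly used discrete analogue of this distance can be computed by dynamic programming. The sketch below implements that discrete version, which couples the sample points monotonically rather than by a one-to-one continuous correspondence; it is offered as an approximation for sampled arcs, not as the definition used in this paper.

```python
import math


def discrete_frechet(P, Q):
    """Discrete Frechet distance between two point sequences P and Q.

    ca[i][j] is the smallest possible maximum distance over all monotone
    couplings of P[0..i] with Q[0..j] that end by pairing P[i] with Q[j].
    """
    n, m = len(P), len(Q)
    ca = [[0.0] * m for _ in range(n)]
    for i in range(n):
        for j in range(m):
            d = math.dist(P[i], Q[j])
            if i == 0 and j == 0:
                ca[i][j] = d
            elif i == 0:
                ca[i][j] = max(ca[0][j - 1], d)
            elif j == 0:
                ca[i][j] = max(ca[i - 1][0], d)
            else:
                ca[i][j] = max(
                    min(ca[i - 1][j], ca[i - 1][j - 1], ca[i][j - 1]), d)
    return ca[n - 1][m - 1]
```

Two parallel segments at unit distance, each sampled at three points, have discrete distance 1.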

† See second footnote on p. 869. This passage from S to $\mathfrak{S}$, and the simple inequalities
concerning the relations between S and $\mathfrak{S}$, have been subsequently used by J. Douglas and E. J. McShane
also. See J. Douglas, Solution of the problem of Plateau, these Transactions, vol. 33 (1931), pp. 263-321,
and The mapping theorem of Koebe and the problem of Plateau, Journal of Mathematics and Physics of
the Massachusetts Institute of Technology, vol. 10 (1931), pp. 106-130; E. J. McShane, these
Transactions, vol. 35 (1933), pp. 716-733.

‡ McShane, loc. cit. Further important contributions to the theory of the conformal maps of
general surfaces are contained in several as yet unpublished papers by C. B. Morrey, which the author
has had the privilege to see in manuscript.

1.2. Given two Jordan arcs $C_1$ and $C_2$, suppose there is given a
transformation T which associates with every point of $C_1$ a definite point of $C_2$ (the
existence of an inverse transformation is not required). T will be called a
monotonic transformation of $C_1$ into a set on $C_2$ if the following conditions
are satisfied. Whenever distinct points $P_1$, $Q_1$, $R_1$ on $C_1$ are such that $Q_1$ is
on the sub-arc with end points $P_1$, $R_1$, then their images $P_2$, $Q_2$, $R_2$ on $C_2$ have
the same relative positions; in case $P_2$ and $R_2$ coincide, it is required that $Q_2$
also coincide with them.

For two Jordan curves $\Gamma_1^*$, $\Gamma_2^*$ a monotonic transformation T of $\Gamma_1^*$
into a set on $\Gamma_2^*$ is then defined as follows. There exists a triple of points $A_1^*$,
$B_1^*$, $C_1^*$ on $\Gamma_1^*$ with distinct images $A_2^*$, $B_2^*$, $C_2^*$, such that the three
non-overlapping arcs $A_1^*B_1^*$, $B_1^*C_1^*$, $C_1^*A_1^*$ of $\Gamma_1^*$ are taken by T in a monotonic
way into sets on the three non-overlapping arcs $A_2^*B_2^*$, $B_2^*C_2^*$, $C_2^*A_2^*$ of $\Gamma_2^*$.
The notion of a continuous monotonic transformation is then
self-explanatory.

1.3. The term monotonic transformation has been chosen to suggest that 
such transformations have a number of properties in common with mono- 
tonic functions. As a consequence, a simple and important selection theorem 
of Helly, concerned with sequences of monotonic functions, generalizes im- 
mediately to sequences of monotonic transformations in the following 
manner.‡

Let there be given a Jordan curve $\gamma$, and a sequence of Jordan curves
$\Gamma_n^*$ converging toward a Jordan curve $\Gamma^*$ in the Fréchet sense. Given three
distinct points $a$, $b$, $c$ on $\gamma$, three distinct points $A^*$, $B^*$, $C^*$ on $\Gamma^*$, and three
distinct points $A_n^*$, $B_n^*$, $C_n^*$ on $\Gamma_n^*$ such that $A_n^* \to A^*$, $B_n^* \to B^*$, $C_n^* \to C^*$.
Consider any sequence of monotonic transformations $T_n$, such that $T_n$
carries $\gamma$ into a set on $\Gamma_n^*$ and $a$, $b$, $c$ into $A_n^*$, $B_n^*$, $C_n^*$. Then there exists a
subsequence $T_{n_k}$ such that for every point $p$ of $\gamma$ the sequence of the points
$P_{n_k}^*$, which correspond to $p$ under $T_{n_k}$, converges toward a definite point $P^*$
on $\Gamma^*$. The transformation which associates with $p$ this limit point $P^*$ is
a monotonic transformation $T$ of $\gamma$ into a set on $\Gamma^*$. The limit transformation
$T$ carries $a$, $b$, $c$ into $A^*$, $B^*$, $C^*$.

1.4. If a sequence of monotonic functions converges, in a closed interval,
toward a continuous function, then the convergence necessarily is uniform.§
As observed by McShane in conversation with the author, this holds obviously
for sequences of monotonic transformations also. Thus we have the
following corollary to the preceding selection theorem: if the limit transformation
$T$ is continuous, then $T_{n_k}$ converges uniformly toward $T$, the meaning
of this assertion being too obvious to be explained.

† J. Douglas, in his work on the problem of Plateau, speaks of proper and improper parametric
representations, and obtains the necessary facts by an interpretation on the torus. The term monotonic
transformation, used in my own work, calls attention to the analogy with monotonic functions;
the necessary facts appear then as immediate consequences of this analogy.

‡ See references cited in second footnote on p. 869.

§ See H. E. Buchanan and T. H. Hildebrandt, Note on the convergence of a sequence of functions
of a certain type, Annals of Mathematics, vol. 9 (1908), p. 123.


1933] THE PROBLEM OF PLATEAU 873

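1.5. Suppose now we have a monotonic transformation $T$ of the unit circle
$u = \cos\theta$, $v = \sin\theta$ into a set on a Jordan curve $\Gamma^*$. Then $T$ can be given by
a set of equations


$T:\quad x = \xi(\theta),\quad y = \eta(\theta),\quad z = \zeta(\theta),$


and, in analogy with monotonic functions, we have the following statements.

(a) If $T$ is not a one-to-one transformation, then there exists an arc $\sigma$ on
$u^2 + v^2 = 1$, such that $\xi(\theta)$, $\eta(\theta)$, $\zeta(\theta)$ all three reduce to constants on $\sigma$.

(b) For every $\theta_0$, $\xi(\theta)$, $\eta(\theta)$, $\zeta(\theta)$ have definite one-sided limits
$\xi_0^+$, $\xi_0^-$, $\eta_0^+$, $\eta_0^-$, $\zeta_0^+$, $\zeta_0^-$.

(c) If, for a certain $\theta_0$, we have $\xi_0^+ = \xi_0^-$, $\eta_0^+ = \eta_0^-$, $\zeta_0^+ = \zeta_0^-$, then $\xi(\theta)$,
$\eta(\theta)$, $\zeta(\theta)$ are continuous at $\theta_0$.

(d) The points of discontinuity of $\xi(\theta)$, $\eta(\theta)$, $\zeta(\theta)$ form a denumerable set.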

1.6. A continuous surface $S$, of the topological type of the circular disc, is
defined by a set of equations


(1)    $S:\quad x = x(u, v),\quad y = y(u, v),\quad z = z(u, v),\quad (u, v) \text{ in } R,$


where $R$ is some Jordan region (that is, the set of points in and on a Jordan
curve in the $uv$-plane), and $x(u, v)$, $y(u, v)$, $z(u, v)$ are continuous in $R$. Given
then another such surface


(2)    $\bar S:\quad x = \bar x(\bar u, \bar v),\quad y = \bar y(\bar u, \bar v),\quad z = \bar z(\bar u, \bar v),\quad (\bar u, \bar v) \text{ in } \bar R,$


the distance $d(S, \bar S)$, in the Fréchet sense, is defined as follows. Let $\tau$ denote
a one-to-one and continuous correspondence between $R$ and $\bar R$, and denote
by $(u, v)$, $(\bar u, \bar v)$ a couple of points corresponding under $\tau$. Denote by $P$ the
point of $S$ corresponding to $(u, v)$, by $\bar P$ the point of $\bar S$ corresponding to $(\bar u, \bar v)$,
and by $M(\tau)$ the maximum distance of $P$ and $\bar P$, for all choices of the couple
$(u, v)$, $(\bar u, \bar v)$. Then $d(S, \bar S)$ is the greatest lower bound of $M(\tau)$, for all possible
choices of $\tau$.

If $d(S, \bar S) = 0$, then $S$ is considered as identical to $\bar S$, and (1) and (2) are
considered as parametric representations of the same surface.

Convergent sequences of surfaces are defined in terms of the distance in
an obvious manner.

A continuous surface $S$ is bounded by a Jordan curve $\Gamma^*$ if it admits of a
representation (1) such that the boundary curve of $R$ is taken in a topological
way into $\Gamma^*$.
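The Fréchet distance of §1.6 is a greatest lower bound over all correspondences and is not computed directly. As a rough illustration only, the sketch below computes the discrete Fréchet distance of two polygonal arcs given by vertex lists. This is a swapped-in, coarser notion: it concerns arcs rather than surfaces or closed curves, it only couples vertices in order, and the name `discrete_frechet` is hypothetical.

```python
import math


def discrete_frechet(P, Q):
    """Discrete Frechet distance of two polygonal arcs.

    P, Q: non-empty lists of points (tuples of equal length). Only
    order-preserving couplings of the vertex sequences are considered, so
    this is a stand-in for the continuous definition, not the same thing.
    """
    n, m = len(P), len(Q)
    # ca[i][j] = best achievable maximum distance when coupling P[:i+1], Q[:j+1]
    ca = [[0.0] * m for _ in range(n)]
    for i in range(n):
        for j in range(m):
            d = math.dist(P[i], Q[j])
            if i == 0 and j == 0:
                ca[i][j] = d
            elif i == 0:
                ca[i][j] = max(ca[0][j - 1], d)
            elif j == 0:
                ca[i][j] = max(ca[i - 1][0], d)
            else:
                ca[i][j] = max(min(ca[i - 1][j], ca[i - 1][j - 1], ca[i][j - 1]), d)
    return ca[n - 1][m - 1]


# Two parallel segments at distance 1, each sampled at three vertices.
P = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0)]
Q = [(0.0, 1.0), (1.0, 1.0), (2.0, 1.0)]
print(discrete_frechet(P, Q))
```

As in the definition of $M(\tau)$, the quantity minimized is a maximum of point distances under an admissible correspondence; here the admissible correspondences are reduced to a finite family.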




1.7. A continuous surface will be called a polyhedron and will be denoted
by $\mathfrak{P}$, if it admits of a parametric representation


(3)    $\mathfrak{P}:\quad x = x(u, v),\quad y = y(u, v),\quad z = z(u, v),\quad (u, v) \text{ in } R,$


with the following properties. $R$ can be subdivided into a finite number of
curvilinear triangles $\delta_1, \cdots, \delta_n$, every one of which is carried by (3) in a
topological way into a non-degenerate plane rectilinear triangle in the $xyz$-space.
These rectilinear triangles will be denoted by $\Delta_1, \cdots, \Delta_n$. It is furthermore
required that the boundary curve of $R$ be carried by (3) in a topological
way into a simple closed polygon $p^*$, which will be called the boundary
polygon of $\mathfrak{P}$. A representation (3) with these properties will be called a
typical representation of $\mathfrak{P}$.

1.8. A fundamental theorem, proved already by H. A. Schwarz, asserts
the existence of conformal maps of polyhedrons, in the following sense. Given
a polyhedron $\mathfrak{P}$, there exists a representation which is typical in the sense of
§1.7 and possesses the following additional properties.

(a) The region $R$ is the unit circle $u^2 + v^2 \le 1$.

(b) The sides of the curvilinear triangles $\delta_1, \cdots, \delta_n$ are analytic arcs
including their end points; none of these triangles has a zero angle.

(c) $x(u, v)$, $y(u, v)$, $z(u, v)$ are analytic in the interior of $\delta_1, \cdots, \delta_n$ and
satisfy there the relations $E = G$, $F = 0$.

(d) Three points $A$, $B$, $C$, given arbitrarily on $u^2 + v^2 = 1$, are carried into
three points $A^*$, $B^*$, $C^*$ arbitrarily given on the boundary polygon $p^*$ of $\mathfrak{P}$.

1.9. The sum of the areas of the triangles $\Delta_1, \cdots, \Delta_n$, defined in §1.7,
is the area $\mathfrak{A}(\mathfrak{P})$ of $\mathfrak{P}$. The area $\mathfrak{A}(S)$, in the Lebesgue sense, of a continuous
surface $S$ is then defined as follows. Consider a sequence of polyhedrons $\mathfrak{P}_n$
converging toward $S$ in the Fréchet sense. $\mathfrak{A}(S)$ is then the greatest lower
bound of $\liminf \mathfrak{A}(\mathfrak{P}_n)$ for all choices of the sequence $\mathfrak{P}_n$.‡
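The first sentence of §1.9 is directly computable. A minimal sketch, assuming the polyhedron is handed over as a vertex list plus index triples for its triangles (this data layout and the names `triangle_area`, `polyhedron_area` are assumptions of the example, not notation of the text):

```python
import math


def triangle_area(p, q, r):
    """Area of the rectilinear triangle with vertices p, q, r in xyz-space."""
    ux, uy, uz = (q[i] - p[i] for i in range(3))
    vx, vy, vz = (r[i] - p[i] for i in range(3))
    # Half the length of the cross product of the two edge vectors.
    cx = uy * vz - uz * vy
    cy = uz * vx - ux * vz
    cz = ux * vy - uy * vx
    return 0.5 * math.sqrt(cx * cx + cy * cy + cz * cz)


def polyhedron_area(vertices, triangles):
    """Sum of the areas of the triangles, i.e. the area of the polyhedron."""
    return sum(triangle_area(*(vertices[i] for i in tri)) for tri in triangles)


# A unit square in the plane z = 0, cut into two triangles.
vertices = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (1.0, 1.0, 0.0), (0.0, 1.0, 0.0)]
triangles = [(0, 1, 2), (0, 2, 3)]
print(polyhedron_area(vertices, triangles))
```

The Lebesgue area of a general surface is not obtained this way; it is the greatest lower bound of the limits inferior of such sums along approximating sequences of polyhedrons.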

1.10. From the existence of conformal maps of polyhedrons McShane§
derived an important existence theorem for saddle-surfaces. To state this
theorem, the following definitions are necessary.

A function $f(u, v)$, defined in $u^2 + v^2 < 1$, satisfies condition (C) if the following
hold:

(a) $f(u, v)$ is continuous in $u^2 + v^2 < 1$.

† The reader may consult the beautiful book by Carathéodory, Conformal Representation (Cambridge
University Press), Chapter VII.

‡ For a systematic presentation of the theory of the area and for literature, see the author's
paper Über das Flächenmass rektifizierbarer Flächen, Mathematische Annalen, vol. 100 (1928), pp.
445-479. Quite recently, important contributions have been made to the theory by McShane (see
for references McShane, loc. cit.) and by C. B. Morrey (in several as yet unpublished papers).

§ McShane, these Transactions, loc. cit.




(b) $f(u, v)$ is, for almost every value of $u$, an absolutely continuous function
of $v$, and for almost every value of $v$ an absolutely continuous function
of $u$.

(c) The Dirichlet integral $\iint (f_u^2 + f_v^2)$, taken over $u^2 + v^2 < 1$, is finite.

1.11. A function $f(u, v)$, continuous in a Jordan region $R$, is monotonic
there if for every domain $D$ (connected open set) in $R$ it is true that
$m \le f(u, v) \le M$ in $D$, where $m$, $M$ denote the minimum and maximum respectively
of $f(u, v)$ on the boundary of $D$.

1.12. A continuous surface (1) is called a saddle-surface if $x(u, v)$, $y(u, v)$,
$z(u, v)$ are monotonic. This property is independent of the parametric representation.

1.13. The theorem of McShane reads then as follows. Given a continuous
surface


(4)    $S:\quad x = x(u, v),\quad y = y(u, v),\quad z = z(u, v),\quad u^2 + v^2 \le 1,$


suppose that

(a) $S$ is a saddle-surface;

(b) $\mathfrak{A}(S)$ is finite;

(c) the equations (4) define a continuous monotonic transformation of
$u^2 + v^2 = 1$ into a Jordan curve $\Gamma^*$.

Then there exists a representation of $S$,


(5)    $S:\quad x = \bar x(u, v),\quad y = \bar y(u, v),\quad z = \bar z(u, v),\quad u^2 + v^2 \le 1,$


which has the following properties.

(α) $\bar x(u, v)$, $\bar y(u, v)$, $\bar z(u, v)$ satisfy condition (C) of §1.10.

(β) $E = G$, $F = 0$ almost everywhere in $u^2 + v^2 < 1$.

(γ) The equations (5) define again a continuous monotonic transformation
of $u^2 + v^2 = 1$ into $\Gamma^*$.

(δ) Three points $A$, $B$, $C$, given arbitrarily on $u^2 + v^2 = 1$, are carried by
(5) into three points $A^*$, $B^*$, $C^*$ given arbitrarily on $\Gamma^*$.

(ε) $\mathfrak{A}(S)$ is given by the usual integral formula, which reduces, on account
of (β), to


$\mathfrak{A}(S) = \tfrac{1}{2} \iint (E + G)\, du\, dv.$


1.14. Given a Jordan curve $\Gamma^*$ in the $xyz$-space, the problem of Plateau for
$\Gamma^*$ will be stated as follows. Determine three functions $x(u, v)$, $y(u, v)$, $z(u, v)$
with the following properties.

(a) $x(u, v)$, $y(u, v)$, $z(u, v)$ are continuous in $u^2 + v^2 \le 1$, harmonic in
$u^2 + v^2 < 1$, and

(b) satisfy in $u^2 + v^2 < 1$ the equations $E = G$, $F = 0$.

(c) The equations $x = x(u, v)$, $y = y(u, v)$, $z = z(u, v)$ carry $u^2 + v^2 = 1$ in
a topological way into $\Gamma^*$.

† McShane, these Transactions, loc. cit.

1.15. Given, in a domain $D$, three functions $x(u, v)$, $y(u, v)$, $z(u, v)$, we
shall say that they form a triple of conjugate harmonic functions, if they are
harmonic and satisfy the equations $E = G$, $F = 0$ in $D$. If one of the three
functions, say $z(u, v)$, vanishes identically, then $x(u, v)$ and $y(u, v)$ are obviously
conjugate harmonic functions in the sense used in the theory of functions.
This analogy between minimal surfaces on the one hand and analytic
functions of a complex variable $w = u + iv$ on the other hand has always been
of fundamental importance in the theory of minimal surfaces.
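A concrete triple may help fix the definition. The sketch below uses Enneper's surface, $x = u - u^3/3 + uv^2$, $y = -v - u^2 v + v^3/3$, $z = u^2 - v^2$, which is not discussed in the text and is brought in here purely as an illustration. Its three coordinates are harmonic polynomials, and the code checks numerically that $E = G$, $F = 0$ at a few sample points. The function names are hypothetical.

```python
def enneper_first_derivatives(u, v):
    """Partial derivatives of Enneper's surface (illustrative example only).

    x = u - u**3/3 + u*v**2,  y = -v - u**2*v + v**3/3,  z = u**2 - v**2
    Returns the vectors (x_u, y_u, z_u) and (x_v, y_v, z_v).
    """
    r_u = (1 - u * u + v * v, -2 * u * v, 2 * u)
    r_v = (2 * u * v, -1 - u * u + v * v, -2 * v)
    return r_u, r_v


def first_fundamental_quantities(r_u, r_v):
    """E, F, G from the two tangent vectors r_u, r_v."""
    E = sum(a * a for a in r_u)
    F = sum(a * b for a, b in zip(r_u, r_v))
    G = sum(b * b for b in r_v)
    return E, F, G


for (u, v) in [(0.0, 0.0), (0.3, -0.7), (0.9, 0.2)]:
    E, F, G = first_fundamental_quantities(*enneper_first_derivatives(u, v))
    # For a triple of conjugate harmonic functions both numbers vanish.
    print(u, v, E - G, F)
```

Setting $z \equiv 0$ and taking $x$, $y$ as real and imaginary part of an analytic function gives the classical case mentioned in the paragraph above.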

We shall need the following two lemmas.†

1.16. Suppose we have, in $u^2 + v^2 < 1$, a triple of conjugate harmonic functions
$x(u, v)$, $y(u, v)$, $z(u, v)$ which remain continuous on $u^2 + v^2 = 1$, and all
three reduce to constants on a certain arc of $u^2 + v^2 = 1$. Then $x(u, v)$, $y(u, v)$,
$z(u, v)$ reduce to constants identically.

1.17. Let there be given, in a sector $0 < u^2 + v^2 < r^2$, $0 < \arctan (v/u) < \alpha$,
a triple of conjugate harmonic functions $x(u, v)$, $y(u, v)$, $z(u, v)$. Suppose that
these functions remain continuous on $v = 0$, $0 < u < r$, and that $x(u, 0)$,
$y(u, 0)$, $z(u, 0)$ approach definite finite limits $x_0$, $y_0$, $z_0$ for $u \to +0$. Then
$x(u, v) \to x_0$, $y(u, v) \to y_0$, $z(u, v) \to z_0$ if $(u, v) \to (0, 0)$ in any subsector


$0 \le \arctan \dfrac{v}{u} \le \beta < \alpha.$


1.18. Let there be given, on the unit circle $u = \cos\theta$, $v = \sin\theta$, three functions
$\xi(\theta)$, $\eta(\theta)$, $\zeta(\theta)$. Suppose that $\xi(\theta)$, $\eta(\theta)$, $\zeta(\theta)$ are summable and that
at a certain $\theta_0$ they have definite finite one-sided limits $\xi_0^+$, $\xi_0^-$, $\eta_0^+$, $\eta_0^-$, $\zeta_0^+$,
$\zeta_0^-$. Suppose that the harmonic functions $x(u, v)$, $y(u, v)$, $z(u, v)$, obtained by
means of the Poisson integral formula by using $\xi(\theta)$, $\eta(\theta)$, $\zeta(\theta)$ as boundary
functions, constitute a triple of conjugate harmonic functions in $u^2 + v^2 < 1$.
Then

$\xi_0^+ = \xi_0^-,\quad \eta_0^+ = \eta_0^-,\quad \zeta_0^+ = \zeta_0^-.$


This lemma, due to J. Douglas‡, can be obtained as an immediate consequence
of the generalized Lindelöf theorem, stated in §1.17.§
† For proofs and literature concerning the lemmas in 1.16 and 1.17, which generalize classical
theorems of Schwarz and of Lindelöf, see E. F. Beckenbach and T. Radó, Subharmonic functions and
minimal surfaces, these Transactions, vol. 35 (1933), pp. 648-661.

‡ Loc. cit. in first footnote on p. 871.

§ Beckenbach and Radó, loc. cit.


1.19. Given, in the $xyz$-space, a Jordan curve $\Gamma^*$, we shall consider in the
sequel an approximate form of the problem of Plateau so often that it is
convenient to have a symbol for it. Given three distinct points $A$, $B$, $C$ on
$u^2 + v^2 = 1$, three distinct points $A^*$, $B^*$, $C^*$ on $\Gamma^*$, and an $\epsilon \ge 0$. We shall then
denote by $P(\Gamma^*; A, B, C, A^*, B^*, C^*; \epsilon)$ the following problem.† Determine
three functions $x(u, v)$, $y(u, v)$, $z(u, v)$ with the following properties.

(a) $x(u, v)$, $y(u, v)$, $z(u, v)$ are continuous in $u^2 + v^2 \le 1$, harmonic in
$u^2 + v^2 < 1$, and

(b) satisfy the relations


$\iint (E^{1/2} - G^{1/2})^2 \le \epsilon,\qquad \iint |F| \le \epsilon,$


the integrals being taken over $u^2 + v^2 < 1$.

(c) The equations


(6)    $x = x(u, v),\quad y = y(u, v),\quad z = z(u, v),\quad u^2 + v^2 = 1,$


define a continuous monotonic transformation (see §1.2) of $u^2 + v^2 = 1$ into
a (not prescribed) Jordan curve $\bar\Gamma^*$, such that the distance of $\Gamma^*$ and $\bar\Gamma^*$ is
$\le \epsilon$. Finally, $A$, $B$, $C$ are taken by (6) into three distinct points $\bar A^*$, $\bar B^*$,
$\bar C^*$ on $\bar\Gamma^*$, such that the distances $A^*\bar A^*$, $B^*\bar B^*$, $C^*\bar C^*$ are all three $\le \epsilon$.

1.20. It is important to observe that the problem $P(\Gamma^*; A, B, C, A^*,
B^*, C^*; 0)$ is identical to the problem of Plateau, as stated in §1.14. Indeed,
if $\epsilon = 0$, then condition (b) in §1.19 reduces to $E = G$, $F = 0$ in $u^2 + v^2 < 1$,
since $E$, $F$, $G$ are continuous (even analytic). We have to see what happens
on the boundary. If $\epsilon = 0$, then condition (c) in §1.19 only requires that the
equations (6) define a continuous and monotonic transformation $T$ of
$u^2 + v^2 = 1$ into $\Gamma^*$. We must show that $T$ is a one-to-one transformation.
However, since $E = G$, $F = 0$ has already been verified, the one-to-one character
of $T$ follows directly from §1.16 in connection with §1.5 (a): otherwise
$x(u, v)$, $y(u, v)$, $z(u, v)$ would reduce to constants on an arc of $u^2 + v^2 = 1$ and
hence identically, which is impossible since $A^*$, $B^*$, $C^*$ are distinct.


2. THE ITERATIVE PROCESS 


2.1. Let there be given, in the $xyz$-space, a Jordan curve $\Gamma^*$. Take three
distinct points $A$, $B$, $C$ on $u^2 + v^2 = 1$, and three distinct points $A^*$, $B^*$, $C^*$
on $\Gamma^*$. Suppose there is given a triple of functions $x_0(u, v)$, $y_0(u, v)$, $z_0(u, v)$
with the following properties.

(α) $x_0(u, v)$, $y_0(u, v)$, $z_0(u, v)$ are continuous in $u^2 + v^2 \le 1$ and harmonic
in $u^2 + v^2 < 1$.

(β) The equations


$x = x_0(u, v),\quad y = y_0(u, v),\quad z = z_0(u, v),\quad u^2 + v^2 = 1,$


define a continuous monotonic transformation of $u^2 + v^2 = 1$ into $\Gamma^*$, such that
$A$, $B$, $C$ are taken into $A^*$, $B^*$, $C^*$.

(γ) The area of the surface


(7)    $\mathfrak{H}_0:\quad x = x_0(u, v),\quad y = y_0(u, v),\quad z = z_0(u, v),\quad u^2 + v^2 \le 1,$


is finite.

On account of condition (α), the area of $\mathfrak{H}_0$ is then given by


$\mathfrak{A}(\mathfrak{H}_0) = \iint (E_0 G_0 - F_0^2)^{1/2},$


where $E_0$, $F_0$, $G_0$ are the first fundamental quantities relative to the representation (7).

† The approximate form of the problem of Plateau has been introduced in the author's papers in
the Annals of Mathematics and Mathematische Zeitschrift, loc. cit.

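2.2. The preceding assumptions being satisfied, the iterative process runs
as follows. On account of conditions (α), (β), (γ) the theorem of McShane
(§1.13) applies to $\mathfrak{H}_0$, and we have therefore a representation


(8)    $\mathfrak{H}_0:\quad x = \bar x_0(u, v),\quad y = \bar y_0(u, v),\quad z = \bar z_0(u, v),\quad u^2 + v^2 \le 1,$


which satisfies the following conditions.

($\bar\alpha$) $\bar x_0(u, v)$, $\bar y_0(u, v)$, $\bar z_0(u, v)$ satisfy condition (C) of §1.10 in $u^2 + v^2 < 1$,
and also satisfy there the relations $\bar E_0 = \bar G_0$, $\bar F_0 = 0$ almost everywhere ($\bar E_0$,
$\bar F_0$, $\bar G_0$ are the first fundamental quantities relative to the representation (8)).

($\bar\beta$) The equations (8) define, for $u^2 + v^2 = 1$, a continuous monotonic transformation
of $u^2 + v^2 = 1$ into $\Gamma^*$, such that $A$, $B$, $C$ are carried into $A^*$, $B^*$, $C^*$.

($\bar\gamma$) On account of ($\bar\alpha$), $\mathfrak{A}(\mathfrak{H}_0)$ is given by


(9)    $\mathfrak{A}(\mathfrak{H}_0) = \tfrac{1}{2} \iint (\bar E_0 + \bar G_0).$


Denote then by $x_1(u, v)$, $y_1(u, v)$, $z_1(u, v)$ the harmonic functions coinciding
with $\bar x_0(u, v)$, $\bar y_0(u, v)$, $\bar z_0(u, v)$ on $u^2 + v^2 = 1$, and define the surface $\mathfrak{H}_1$ by


$\mathfrak{H}_1:\quad x = x_1(u, v),\quad y = y_1(u, v),\quad z = z_1(u, v),\quad u^2 + v^2 \le 1.$


Let us first verify that $\mathfrak{H}_1$ has also a finite area. Since a harmonic function
with given boundary values minimizes the Dirichlet integral† we have the
inequalities


$\iint (x_{1u}^2 + x_{1v}^2) \le \iint (\bar x_{0u}^2 + \bar x_{0v}^2),$
$\iint (y_{1u}^2 + y_{1v}^2) \le \iint (\bar y_{0u}^2 + \bar y_{0v}^2),$
$\iint (z_{1u}^2 + z_{1v}^2) \le \iint (\bar z_{0u}^2 + \bar z_{0v}^2),$


and hence, by addition,


(10)    $\iint (E_1 + G_1) \le \iint (\bar E_0 + \bar G_0),$


the integrals being taken over $u^2 + v^2 < 1$. Since


$(E_1 G_1 - F_1^2)^{1/2} \le \tfrac{1}{2}(E_1 + G_1),$


it follows from (9) and from (10) that the integral of $(E_1 G_1 - F_1^2)^{1/2}$ taken over
$u^2 + v^2 < 1$, that is to say, the area of $\mathfrak{H}_1$, is finite. For further use we note
the obvious inequalities


(11)    $\mathfrak{A}(\mathfrak{H}_1) = \iint (E_1 G_1 - F_1^2)^{1/2} \le \tfrac{1}{2} \iint (E_1 + G_1) \le \tfrac{1}{2} \iint (\bar E_0 + \bar G_0) = \mathfrak{A}(\mathfrak{H}_0) = \iint (E_0 G_0 - F_0^2)^{1/2} \le \tfrac{1}{2} \iint (E_0 + G_0).$


Since the area of $\mathfrak{H}_1$ is finite, we can repeat the procedure by which we
derived $\mathfrak{H}_1$ from $\mathfrak{H}_0$. We obtain in this way a surface $\mathfrak{H}_2$, to which we can
apply the same procedure, and so on indefinitely. We obtain in this way a
sequence of surfaces


(12)    $\mathfrak{H}_n:\quad x = x_n(u, v),\quad y = y_n(u, v),\quad z = z_n(u, v),\quad u^2 + v^2 \le 1,$


where $x_n(u, v)$, $y_n(u, v)$, $z_n(u, v)$ satisfy the conditions (α), (β), (γ) listed in
§2.1 (the points $A$, $B$, $C$, $A^*$, $B^*$, $C^*$ are kept fixed).

† See Hurwitz-Courant, Funktionentheorie (Berlin, Springer, 1922), p. 335, Hilfsatz II. The proof,
as given there, covers the present case on account of condition (C) stated in §1.10 (see McShane, loc.
cit.).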
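The pointwise inequalities behind (11), namely $(EG - F^2)^{1/2} \le (EG)^{1/2} \le \tfrac{1}{2}(E + G)$, can be checked on any pair of tangent vectors. The sketch below is only an illustration; the function name `area_integrand_bounds` and the sample vectors are assumptions of the example.

```python
import math


def area_integrand_bounds(r_u, r_v):
    """Return ((EG - F^2)^(1/2), (EG)^(1/2), (E + G)/2) for tangent vectors r_u, r_v."""
    E = sum(a * a for a in r_u)
    F = sum(a * b for a, b in zip(r_u, r_v))
    G = sum(b * b for b in r_v)
    # EG - F^2 is non-negative by the Schwarz inequality; max() only guards
    # against a tiny negative value caused by rounding.
    area_density = math.sqrt(max(E * G - F * F, 0.0))
    return area_density, math.sqrt(E * G), 0.5 * (E + G)


# A non-conformal pair: the three quantities are strictly increasing.
print(area_integrand_bounds((1.0, 0.0, 0.0), (1.0, 2.0, 0.0)))
# An isothermic pair (E = G, F = 0): all three quantities coincide.
print(area_integrand_bounds((2.0, 0.0, 0.0), (0.0, 2.0, 0.0)))
```

Equality throughout holds exactly when $E = G$ and $F = 0$, which is why passing to an isothermic representation and then to the harmonic functions with the same boundary values cannot increase the integrals compared in (11).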
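2.3. We put


$\epsilon_n = \max \left[ \iint (E_n^{1/2} - G_n^{1/2})^2,\ \iint |F_n| \right],$†


the integrals being taken over $u^2 + v^2 < 1$.
We assert that $\epsilon_n \to 0$. In other words, $x_n(u, v)$, $y_n(u, v)$, $z_n(u, v)$ constitute a
solution of the problem $P(\Gamma^*; A, B, C, A^*, B^*, C^*; \epsilon_n)$, where $\epsilon_n \to 0$. Furthermore,
we have $\mathfrak{A}(\mathfrak{H}_n) \le \mathfrak{A}(\mathfrak{H}_0)$, where $\mathfrak{H}_n$ and $\mathfrak{H}_0$ are given by (12) and (7)
respectively. In other words, every one of the harmonic surfaces $\mathfrak{H}_n$ has an
area $\le$ the area of the initial harmonic surface $\mathfrak{H}_0$.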

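2.4. To prove the preceding assertions, let us put


$\mathfrak{A}_n = \mathfrak{A}(\mathfrak{H}_n),\qquad \mathfrak{S}_n = \tfrac{1}{2} \iint (E_n + G_n).$


Then (11) yields the relations


$\mathfrak{A}_1 \le \mathfrak{S}_1 \le \mathfrak{A}_0 \le \mathfrak{S}_0.$


Since $\mathfrak{H}_{n+1}$ is derived from $\mathfrak{H}_n$ by the same procedure as $\mathfrak{H}_1$ from $\mathfrak{H}_0$, we have
quite generally for $n = 0, 1, 2, \cdots$ the relations


(13)    $\mathfrak{A}_{n+1} \le \mathfrak{S}_{n+1} \le \mathfrak{A}_n \le \mathfrak{S}_n.$


From (13) it follows that the sequences $\mathfrak{A}_0, \mathfrak{A}_1, \mathfrak{A}_2, \cdots$ and $\mathfrak{S}_0, \mathfrak{S}_1, \mathfrak{S}_2, \cdots$
are both descending. Since all the terms are $\ge 0$, both sequences are convergent,
and from (13) it follows then immediately that both sequences converge
toward the same limit. Hence if we put $\sigma_n = \mathfrak{S}_n - \mathfrak{A}_n$, then


$\sigma_n \ge 0,\qquad \sigma_n \to 0.$


Since $\mathfrak{A}_0, \mathfrak{A}_1, \mathfrak{A}_2, \cdots$ is a descending sequence, we have also


(14)    $\mathfrak{A}_n \le \mathfrak{A}_0.$


Suppose in the sequel that $n \ge 1$. From the relations


(15)    $\mathfrak{A}_n = \iint (E_n G_n - F_n^2)^{1/2} \le \iint E_n^{1/2} G_n^{1/2} \le \tfrac{1}{2} \iint (E_n + G_n) = \mathfrak{S}_n$


we infer


(16)    $\tfrac{1}{2} \iint (E_n^{1/2} - G_n^{1/2})^2 \le \mathfrak{S}_n - \mathfrak{A}_n = \sigma_n.$


Furthermore, since


$|F_n| = \left[ E_n^{1/2} G_n^{1/2} + (E_n G_n - F_n^2)^{1/2} \right]^{1/2} \left[ E_n^{1/2} G_n^{1/2} - (E_n G_n - F_n^2)^{1/2} \right]^{1/2},$


we infer, from (15), (16), and from the inequality of Schwarz, that


(17)    $\left[ \iint |F_n| \right]^2 \le \iint \left[ E_n^{1/2} G_n^{1/2} + (E_n G_n - F_n^2)^{1/2} \right] \times \iint \left[ E_n^{1/2} G_n^{1/2} - (E_n G_n - F_n^2)^{1/2} \right] \le (\mathfrak{S}_n + \mathfrak{A}_n)(\mathfrak{S}_n - \mathfrak{A}_n) = (\mathfrak{S}_n + \mathfrak{A}_n)\, \sigma_n.$


On account of (13) we have, since $n \ge 1$, the inequalities $\mathfrak{A}_n \le \mathfrak{S}_n \le \mathfrak{A}_0$. Hence,
from (17),


(18)    $\iint |F_n| \le (2 \mathfrak{A}_0 \sigma_n)^{1/2}.$


From (16) and (18) it follows that


$\epsilon_n \le \max \left[ 2 \sigma_n,\ 2^{1/2} \mathfrak{A}_0^{1/2} \sigma_n^{1/2} \right].$


Since $\mathfrak{A}_0$ is finite and $\sigma_n \to 0$, this proves that $\epsilon_n \to 0$. The last part of the statement
in §2.3 has already been proved by (14).

† If $a$, $b$ are two real numbers, then $\max (a, b)$ denotes the greater one of the two numbers (or
their common value, if they are equal).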

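2.5. We apply the preceding results first to the following situation. Let
there be given, in the $xyz$-space, a simple closed polygon $p^*$, and let there
also be given a polyhedron $\mathfrak{P}$ bounded by $p^*$. Given further three distinct
points $A$, $B$, $C$ on $u^2 + v^2 = 1$, three distinct points $A^*$, $B^*$, $C^*$ on $p^*$, and given
an $\epsilon > 0$.

Then there exists a solution


$S:\quad x = x(u, v),\quad y = y(u, v),\quad z = z(u, v),\quad u^2 + v^2 \le 1,$


of the problem $P(p^*; A, B, C, A^*, B^*, C^*; \epsilon)$ such that $\mathfrak{A}(S) \le \mathfrak{A}(\mathfrak{P})$. In particular
(disregarding the condition concerned with the areas), the problem
$P(p^*; A, B, C, A^*, B^*, C^*; \epsilon)$ has a solution for every choice of $A$, $B$, $C$, $A^*$,
$B^*$, $C^*$.

To see this, let


$\mathfrak{P}:\quad x = \bar x(u, v),\quad y = \bar y(u, v),\quad z = \bar z(u, v),\quad u^2 + v^2 \le 1,$


be an isothermic representation of $\mathfrak{P}$, such that $A$, $B$, $C$ are taken into $A^*$,
$B^*$, $C^*$. Let $x_0(u, v)$, $y_0(u, v)$, $z_0(u, v)$ be the harmonic functions coinciding
with $\bar x(u, v)$, $\bar y(u, v)$, $\bar z(u, v)$ on $u^2 + v^2 = 1$, and denote by $\mathfrak{H}_0$ the surface


$\mathfrak{H}_0:\quad x = x_0(u, v),\quad y = y_0(u, v),\quad z = z_0(u, v),\quad u^2 + v^2 \le 1.$


On account of the minimizing property of harmonic functions we have again


(19)    $\iint (E_0 + G_0) \le \iint (\bar E + \bar G).$


From the inequality $(E_0 G_0 - F_0^2)^{1/2} \le (E_0 + G_0)/2$, from the
equations $\bar E = \bar G$, $\bar F = 0$ and from (19) we infer that


(20)    $\mathfrak{A}(\mathfrak{H}_0) \le \tfrac{1}{2} \iint (E_0 + G_0) \le \tfrac{1}{2} \iint (\bar E + \bar G) = \mathfrak{A}(\mathfrak{P}).$


Thus the area of $\mathfrak{H}_0$ is finite. Hence we can start up the iterative process,
beginning with $\mathfrak{H}_0$. There results a sequence of solutions


$\mathfrak{H}_n:\quad x = x_n(u, v),\quad y = y_n(u, v),\quad z = z_n(u, v),\quad u^2 + v^2 \le 1,$


of the problems $P(p^*; A, B, C, A^*, B^*, C^*; \epsilon_n)$, where $\epsilon_n \to 0$ and $\mathfrak{A}(\mathfrak{H}_n)
\le \mathfrak{A}(\mathfrak{H}_0)$. Hence, on account of (20), $\mathfrak{A}(\mathfrak{H}_n) \le \mathfrak{A}(\mathfrak{P})$ for every $n$, while $\epsilon_n$ will
be $< \epsilon$ for $n$ large enough. Thus, for $n$ large enough, $\mathfrak{H}_n$ can be used as the
surface $S$ whose existence has been asserted at the beginning of the present
section.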

2.6. Consider next a Jordan curve $\Gamma^*$ in the $xyz$-space. Given three distinct
points $A$, $B$, $C$ on $u^2 + v^2 = 1$, three distinct points $A^*$, $B^*$, $C^*$ on $\Gamma^*$, and an
$\epsilon > 0$. Then there exists a solution of the problem $P(\Gamma^*; A, B, C, A^*, B^*, C^*; \epsilon)$.

To see this, observe first that we can approximate $\Gamma^*$, in the sense of
Fréchet, by simple closed polygons. Hence we have a polygon $p^*$ such that
$d(p^*, \Gamma^*) < \epsilon/2$. From the definition of the distance of two curves it follows
then that we have on $p^*$ three distinct points $\bar A^*$, $\bar B^*$, $\bar C^*$ such that the
distances $A^*\bar A^*$, $B^*\bar B^*$, $C^*\bar C^*$ are all three $< \epsilon/2$. On account of §2.5, the
problem $P(p^*; A, B, C, \bar A^*, \bar B^*, \bar C^*; \epsilon/2)$ has a solution, and this solution is
clearly a solution of the problem $P(\Gamma^*; A, B, C, A^*, B^*, C^*; \epsilon)$, as follows
immediately from the definition of this problem (see §1.19).

2.7. Consider finally a Jordan curve $\Gamma^*$ which bounds some continuous
surface, of the topological type of the circular disc, with a finite area. Then
the greatest lower bound $a(\Gamma^*)$ of the areas of all such surfaces (bounded by
$\Gamma^*$) is finite. Given three distinct points $A$, $B$, $C$ on $u^2 + v^2 = 1$, three distinct
points $A^*$, $B^*$, $C^*$ on $\Gamma^*$, and an $\epsilon > 0$. Then there exists a solution


$S:\quad x = x(u, v),\quad y = y(u, v),\quad z = z(u, v),\quad u^2 + v^2 \le 1,$


of the problem $P(\Gamma^*; A, B, C, A^*, B^*, C^*; \epsilon)$, such that $\mathfrak{A}(S) \le a(\Gamma^*) + \epsilon$.

To see this, observe first that we have, on account of the definition of
$a(\Gamma^*)$, a continuous surface $S_0$, bounded by $\Gamma^*$, such that


(21)    $\mathfrak{A}(S_0) < a(\Gamma^*) + \dfrac{\epsilon}{2}.$


On account of the definition of the area, we have then a polyhedron $\mathfrak{P}$, such
that


(22)    $d(S_0, \mathfrak{P}) < \dfrac{\epsilon}{2},\qquad \mathfrak{A}(\mathfrak{P}) < \mathfrak{A}(S_0) + \dfrac{\epsilon}{2}.$


On account of the definition of $d(S_0, \mathfrak{P})$, the distance of the boundary polygon
$p^*$ of $\mathfrak{P}$ and of $\Gamma^*$ is $< \epsilon/2$. Hence we have three distinct points $\bar A^*$, $\bar B^*$, $\bar C^*$
on $p^*$ such that the distances $A^*\bar A^*$, $B^*\bar B^*$, $C^*\bar C^*$ are all three $< \epsilon/2$. On account
of §2.5, we have then a solution $S$ of the problem $P(p^*; A, B, C,
\bar A^*, \bar B^*, \bar C^*; \epsilon/2)$ such that $\mathfrak{A}(S) \le \mathfrak{A}(\mathfrak{P})$. This $S$ is then clearly a solution of
the problem $P(\Gamma^*; A, B, C, A^*, B^*, C^*; \epsilon)$ and it also satisfies the inequality
$\mathfrak{A}(S) < a(\Gamma^*) + \epsilon$, on account of (21) and (22).




3. A SELECTION THEOREM 


3.1. Let $\Gamma^*$ be a Jordan curve in the $xyz$-space. Choose three distinct
points $A$, $B$, $C$ on $u^2 + v^2 = 1$ and three distinct points $A^*$, $B^*$, $C^*$ on $\Gamma^*$.
Suppose we have a sequence $\epsilon_n > 0$, $\epsilon_n \to 0$, such that the problem $P(\Gamma^*;
A, B, C, A^*, B^*, C^*; \epsilon_n)$ has a solution


(23)    $S_n:\quad x = x_n(u, v),\quad y = y_n(u, v),\quad z = z_n(u, v),\quad u^2 + v^2 \le 1,$


for $n = 1, 2, 3, \cdots$.

Then the sequence (23) contains a subsequence $S_{n_k}$ such that $x_{n_k}(u, v)$,
$y_{n_k}(u, v)$, $z_{n_k}(u, v)$ converge uniformly in $u^2 + v^2 \le 1$. The limit surface


$S:\quad x = x(u, v),\quad y = y(u, v),\quad z = z(u, v),\quad u^2 + v^2 \le 1,$


is a solution of the problem $P(\Gamma^*; A, B, C, A^*, B^*, C^*; 0)$. If $\liminf \mathfrak{A}(S_{n_k})$
is finite, then $\mathfrak{A}(S)$ is also finite and satisfies the inequality


$\mathfrak{A}(S) \le \liminf \mathfrak{A}(S_{n_k}).$


† R. Garnier, Sur le problème de Plateau, Annales Scientifiques de l'École Normale, vol. 45
(1928), pp. 53-144; T. Radó, Some remarks on the problem of Plateau, Proceedings of the National
Academy of Sciences, vol. 16 (1930), pp. 242-248, and Annals of Mathematics and Mathematische
Zeitschrift, loc. cit.; J. Douglas, loc. cit. in the first footnote on p. 871.

In the development of my own work, I was guided by the analogy with a theorem, stated by
Carathéodory, concerning the conformal maps of variable plane Jordan regions (see R. Courant,
Über eine Eigenschaft der Abbildungsfunktionen bei konformer Abbildung, Göttinger Nachrichten,
1914 and a notice in 1922; T. Radó, Sur la représentation conforme de domaines variables, Acta Szeged,
vol. 1 (1923)).


3.2. This theorem includes, as special cases, a number of selection theorems
used previously in the literature.† The proof of the theorem runs as
follows. If we put, for the sake of clarity,


$\xi_n(\theta) = x_n(\cos\theta, \sin\theta),\quad \eta_n(\theta) = y_n(\cos\theta, \sin\theta),\quad \zeta_n(\theta) = z_n(\cos\theta, \sin\theta),$


then the equations


$x = \xi_n(\theta),\quad y = \eta_n(\theta),\quad z = \zeta_n(\theta)$


define by assumption a sequence of monotonic transformations, and we can
apply to this sequence the generalization of the selection theorem of Helly
(see §1.3). There exists therefore a subsequence


$x = \xi_{n_k}(\theta),\quad y = \eta_{n_k}(\theta),\quad z = \zeta_{n_k}(\theta)$


which converges everywhere on the unit circle $u = \cos\theta$, $v = \sin\theta$. The limit
functions $\xi(\theta)$, $\eta(\theta)$, $\zeta(\theta)$ define a monotonic transformation


(24)    $T:\quad x = \xi(\theta),\quad y = \eta(\theta),\quad z = \zeta(\theta)$





of the unit circle $u = \cos\theta$, $v = \sin\theta$ into a set on $\Gamma^*$, such that $A$, $B$, $C$ are
taken into $A^*$, $B^*$, $C^*$. Denote by $x(u, v)$, $y(u, v)$, $z(u, v)$ the harmonic functions
obtained by means of the Poisson integral formula, using $\xi(\theta)$, $\eta(\theta)$,
$\zeta(\theta)$ as boundary functions.† We are going to discuss the surface


(25)    $S:\quad x = x(u, v),\quad y = y(u, v),\quad z = z(u, v),$


which is defined, for the time being, only in $u^2 + v^2 < 1$.
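The passage from boundary functions to harmonic functions used above can be illustrated numerically. The sketch below evaluates the Poisson integral $\frac{1}{2\pi} \int_0^{2\pi} \frac{1 - r^2}{1 - 2r\cos(\varphi - \theta) + r^2}\, f(\theta)\, d\theta$ at the interior point $u = r\cos\varphi$, $v = r\sin\varphi$ by an equally spaced sum. The function name `poisson_integral` and the number of sample points are assumptions of the example, and the quadrature is only a rough stand-in for the integral itself.

```python
import math


def poisson_integral(boundary, r, phi, n=720):
    """Approximate the Poisson integral of `boundary` at the point (r, phi).

    boundary: function of the angle theta giving the boundary values.
    r, phi:   polar coordinates of an interior point, with 0 <= r < 1.
    n:        number of equally spaced sample angles (a choice of this sketch).
    """
    total = 0.0
    for k in range(n):
        theta = 2.0 * math.pi * k / n
        kernel = (1.0 - r * r) / (1.0 - 2.0 * r * math.cos(phi - theta) + r * r)
        total += kernel * boundary(theta)
    # (1 / 2 pi) * (2 pi / n) * sum = sum / n
    return total / n


# Boundary values cos(theta) extend to the harmonic function u = r cos(phi).
print(poisson_integral(math.cos, 0.5, 1.0), 0.5 * math.cos(1.0))
```

Applying the same routine separately to three boundary functions plays the role of forming the three harmonic coordinates of the surface (25); whether the resulting triple satisfies $E = G$, $F = 0$ is a separate question, settled in the text by the limiting argument of §3.3.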

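† On account of §1.5, (d), the functions $\xi(\theta)$, $\eta(\theta)$, $\zeta(\theta)$ are integrable even in the Riemann sense.


3.3. It follows from the Poisson integral formula that the harmonic
functions $x_{n_k}(u, v)$, $y_{n_k}(u, v)$, $z_{n_k}(u, v)$ and all of their partial derivatives converge,
in $u^2 + v^2 < 1$, toward $x(u, v)$, $y(u, v)$, $z(u, v)$ and their corresponding
partial derivatives, the convergence being uniform in every concentric circle
$u^2 + v^2 < r^2 < 1$. Consequently we have


(26)    $\iint_{(r)} (E_{n_k}^{1/2} - G_{n_k}^{1/2})^2 \to \iint_{(r)} (E^{1/2} - G^{1/2})^2,$
(27)    $\iint_{(r)} |F_{n_k}| \to \iint_{(r)} |F|,$
(28)    $\iint_{(r)} (E_{n_k} G_{n_k} - F_{n_k}^2)^{1/2} \to \iint_{(r)} (EG - F^2)^{1/2},$


for every $r$ such that $0 < r < 1$, the symbol $(r)$ indicating that the integrals are
taken over $u^2 + v^2 < r^2$. We also have


(29)    $\iint_{(r)} (E_{n_k}^{1/2} - G_{n_k}^{1/2})^2 \le \iint_{(1)} (E_{n_k}^{1/2} - G_{n_k}^{1/2})^2 \le \epsilon_{n_k},$
(30)    $\iint_{(r)} |F_{n_k}| \le \iint_{(1)} |F_{n_k}| \le \epsilon_{n_k},$
(31)    $\iint_{(r)} (E_{n_k} G_{n_k} - F_{n_k}^2)^{1/2} \le \iint_{(1)} (E_{n_k} G_{n_k} - F_{n_k}^2)^{1/2} = \mathfrak{A}(S_{n_k}),$


where the last line is, of course, to be considered only if $\mathfrak{A}(S_{n_k})$ is finite. From
(26) to (31) it follows then, on account of $\epsilon_{n_k} \to 0$, that


(32)    $\iint_{(r)} (E^{1/2} - G^{1/2})^2 = 0,\qquad \iint_{(r)} |F| = 0,$
(33)    $\iint_{(r)} (EG - F^2)^{1/2} \le \liminf \mathfrak{A}(S_{n_k}).$


From (33) it follows, in case $\liminf \mathfrak{A}(S_{n_k})$ happens to be finite, for $r \to 1$ that


(34)    $\mathfrak{A}(S) = \iint_{(1)} (EG - F^2)^{1/2} \le \liminf \mathfrak{A}(S_{n_k}).$


From (32) it follows, since $E$, $F$, $G$ are continuous (and even analytic), that
$E = G$, $F = 0$ for $u^2 + v^2 < r^2$. Since $r$ is arbitrary, it follows that


(35)    $E = G,\quad F = 0 \quad\text{in } u^2 + v^2 < 1.$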


3.4. Since (24) is a monotonic transformation, $\xi(\theta)$, $\eta(\theta)$, $\zeta(\theta)$ have the
properties listed in §1.5. On account of (35), it follows therefore from the
lemma of Douglas (see §1.18), in connection with §1.5, (c), that $\xi(\theta)$, $\eta(\theta)$,
$\zeta(\theta)$ are continuous on the whole unit circle $u = \cos\theta$, $v = \sin\theta$. Consequently
the harmonic functions $x(u, v)$, $y(u, v)$, $z(u, v)$ remain continuous on $u^2 + v^2 = 1$
(that is to say, they are continuous in $u^2 + v^2 \le 1$), and we have


(36)    $x(u, v) = \xi(\theta),\quad y(u, v) = \eta(\theta),\quad z(u, v) = \zeta(\theta) \quad\text{on } u^2 + v^2 = 1.$


From this it follows that the equations (24) carry $u^2 + v^2 = 1$ in a one-to-one
way into $\Gamma^*$. Otherwise there would exist (see §1.5 (a)) an arc $\sigma$ of $u^2 + v^2 = 1$,
such that $\xi(\theta)$, $\eta(\theta)$, $\zeta(\theta)$ all three reduce to constants on $\sigma$. On account of (36)
and (35) it would then follow from §1.16 that $x(u, v)$, $y(u, v)$, $z(u, v)$ and consequently
(cf. (36)) also $\xi(\theta)$, $\eta(\theta)$, $\zeta(\theta)$ reduce to constants identically. This
contradicts the fact that the equations (24) carry the points $A$, $B$, $C$ into
three distinct points $A^*$, $B^*$, $C^*$.

3.5. Since it has been established in §3.4 that $\xi(\theta)$, $\eta(\theta)$, $\zeta(\theta)$ are continuous,
it follows from §1.4 that $\xi_{n_k}(\theta)$, $\eta_{n_k}(\theta)$, $\zeta_{n_k}(\theta)$ converge uniformly toward
$\xi(\theta)$, $\eta(\theta)$, $\zeta(\theta)$. From the principle of maximum it follows then that the
harmonic functions $x_{n_k}(u, v)$, $y_{n_k}(u, v)$, $z_{n_k}(u, v)$ converge uniformly toward
$x(u, v)$, $y(u, v)$, $z(u, v)$ in $u^2 + v^2 \le 1$. This completes the proof of the theorem
stated in §3.1.

4. APPLICATIONS 

4.1. Let there be given, in the $xyz$-space, a Jordan curve $\Gamma^*$. Given three
distinct points $A$, $B$, $C$ on $u^2 + v^2 = 1$, and three distinct points $A^*$, $B^*$, $C^*$ on
$\Gamma^*$. Then the problem $P(\Gamma^*; A, B, C, A^*, B^*, C^*; 1/n)$ is solvable for every
positive integer $n$ (see §2.6). Denote by


(37)    $S_n:\quad x = x_n(u, v),\quad y = y_n(u, v),\quad z = z_n(u, v),\quad u^2 + v^2 \le 1,$


a solution of this problem. On account of the selection theorem (see §3.1), a
properly chosen subsequence of (37) will converge uniformly in $u^2 + v^2 \le 1$,
and the limit surface solves the problem $P(\Gamma^*; A, B, C, A^*, B^*, C^*; 0)$, that
is to say, the problem of Plateau for $\Gamma^*$ (see §1.20). In other words: the
problem of Plateau is solvable for every Jordan curve.†


† This result was first obtained by J. Douglas, loc. cit. in the first footnote on p. 871.




4.2. Suppose now that $\Gamma^*$ bounds some continuous surface, of the topological
type of the circular disc, with a finite area. Then the greatest lower
bound $a(\Gamma^*)$ of the areas of all continuous surfaces, of the topological type
of the circular disc and bounded by $\Gamma^*$, is finite. The problem $P(\Gamma^*; A, B, C,
A^*, B^*, C^*; 1/n)$ has then (see §2.7) a solution


$S_n:\quad x = x_n(u, v),\quad y = y_n(u, v),\quad z = z_n(u, v),\quad u^2 + v^2 \le 1,$


such that


(38)    $\mathfrak{A}(S_n) \le a(\Gamma^*) + 1/n.$


On account of the selection theorem of §3.1, a properly chosen subsequence
$S_{n_k}$ will converge, uniformly in $u^2 + v^2 \le 1$, toward a solution $S$ of the problem
$P(\Gamma^*; A, B, C, A^*, B^*, C^*; 0)$, such that


(39)    $\mathfrak{A}(S) \le \liminf \mathfrak{A}(S_{n_k}).$


From (38) and (39) it follows that $\mathfrak{A}(S) \le a(\Gamma^*)$. On the other hand we have
also $\mathfrak{A}(S) \ge a(\Gamma^*)$, since $S$ is a continuous surface, of the type of the circular
disc, bounded by $\Gamma^*$. Hence $\mathfrak{A}(S) = a(\Gamma^*)$. In other words: if $\Gamma^*$ bounds some
continuous surface (of the topological type of the circular disc) with a finite area,
then there exists a minimal surface, bounded by $\Gamma^*$, whose area is a minimum
with respect to all continuous surfaces, of the topological type of the circular disc,
bounded by $\Gamma^*$.†

† This result was first obtained by the author. See Radó, The problem of the least area and the
problem of Plateau, Mathematische Zeitschrift, vol. 32 (1930), pp. 763-796. Subsequent proofs have
been given by J. Douglas, Solution of the problem of Plateau, these Transactions, vol. 33 (1931), pp.
263-321, and, quite recently, by E. J. McShane, Parametrizations of saddle surfaces, with application
to the problem of Plateau, in the current volume of these Transactions, pp. 716-733. All the proofs
of this result known at the present time depend on the passage from S to D (see the Introduction)
first used in the author's paper On Plateau's problem, Annals of Mathematics, (2), vol. 31 (1930),
pp. 457-469.

4.3. Suppose now that $\Gamma^*$ is a simple closed polygon $p^*$. Then, first of
all, the construction used in §2.5 shows that we have a harmonic surface

$\mathfrak{H}_0:\ x = x_0(u, v),\ y = y_0(u, v),\ z = z_0(u, v),\ u^2 + v^2 \le 1,$

bounded by $p^*$ and having a finite area. Starting with any such surface $\mathfrak{H}_0$,
the iterative process yields a sequence

(40) $\mathfrak{H}_n:\ x = x_n(u, v),\ y = y_n(u, v),\ z = z_n(u, v),\ u^2 + v^2 \le 1,$

and $\mathfrak{H}_n$ solves the problem $P(p^*; A, B, C, A^*, B^*, C^*; \epsilon_n)$, where $\epsilon_1, \epsilon_2, \dots, \epsilon_n, \dots$
are positive numbers such that $\epsilon_n \to 0$. On account of the selection
theorem, the sequence (40) contains a uniformly convergent subsequence,
and the limit surface solves the problem of Plateau for $p^*$. It can easily be
shown that if we choose the initial harmonic surface $\mathfrak{H}_0$ in all possible ways,
then we obtain in this way all the solutions of the problem of Plateau. Indeed
let

$S:\ x = x(u, v),\ y = y(u, v),\ z = z(u, v),\ u^2 + v^2 \le 1,$

be any solution of the problem as stated in §1.14. Then $x(u, v)$, $y(u, v)$, $z(u, v)$
are harmonic in $u^2+v^2 < 1$, and thus we can use $S$ as the initial harmonic surface
of the iterative process, provided the area of $S$ is finite. But this is indeed
so, on account of a theorem of Carleman. According to Carleman, every minimal
surface, of the topological type of the circular disc, satisfies the isoperimetric
inequality

$\mathfrak{A} \le \frac{1}{4\pi} L^2,$

where $\mathfrak{A}$ is the area of the surface and $L$ is the length of its perimeter.† In
our case, the perimeter is a polygon, and thus $L$ and consequently $\mathfrak{A}$ is finite.
Thus $S$ can be used as the initial harmonic surface $\mathfrak{H}_0$ of the iterative process,
and it is then obvious, on account of condition (b) of the problem of Plateau
(see §1.14), that all the harmonic surfaces $\mathfrak{H}_n$ coincide with $S$.

4.4. Thus the fact that the iterative process yields all the solutions of the 
problem of Plateau appears as trivial. On the other hand, I feel that this fact 
constitutes the specific advantage of the method. My own previous work, 
as well as the work of Douglas and that of McShane, yielded a solution with 
a minimum area, and therefore certainly not the general solution. 

The iterative process might be considered therefore as a contribution to 
the problem of determining the totality of the solutions of the problem of Plateau. 
If the given curve has a simply covered convex curve as its parallel or central 
projection upon some plane, then the solution of the problem is unique.‡
As far as I know, the exact number of the solutions has not yet been deter- 
mined in any other case. 


† See, for a simplified proof and literature, E. F. Beckenbach, The area and boundary of minimal
surfaces, Annals of Mathematics, vol. 33 (1932), pp. 658-664. Further developments on the isoperimetric
inequality are contained in a joint paper by E. F. Beckenbach and the present author, Subharmonic
functions and surfaces of negative curvature, these Transactions, vol. 35 (1933), pp. 662-674.

‡ See Radó, Acta Szeged, vol. 6 (1932), pp. 1-20.


STATE UNIVERSITY 
COLUMBUS, OHIO


EFFECTS OF LINEAR TRANSFORMATIONS ON THE 
DIVERGENCE OF BOUNDED SEQUENCES 
AND FUNCTIONS* 


BY 
JOSEPH LEV 


1. Introduction. The transformation

$y_n = \sum_{i=1}^{\infty} K_{n,i}\, x_i,$

where $\{x_i\}$ is a sequence of complex elements and the $K_{n,i}$ are complex
numbers, has been widely studied, and the conditions which must be fulfilled
by the $K_{n,i}$ in order that the property of convergence of the sequence
may remain invariant were given by Schur [1].† In recent studies by Hurwitz
[2, 3] and Knopp [4] modes of measuring the divergence of bounded
sequences were given, and the conditions on the $K_{n,i}$ were found under which
the divergence of the sequence $\{y_n\}$ is no greater than that of $\{x_n\}$.

In this paper the effects of the transformations will be investigated with
fewer restrictions on the $K_{n,i}$ than those imposed by earlier writers. The
problem will be approached by means of the new concept of the limit circle
defined as follows:

The limit circle of a bounded sequence of complex elements is the (unique)
circle of least radius which contains within or on its boundary the limit points
of the sequence.

The limit circle of a bounded function $F(y)$ of the complex variable $y$ as
$y \to \xi$ (finite or infinite) is analogously defined in terms of the limit points of
$F(y)$ as $y \to \xi$; this concept will be used in the study of transformations of
sequences and functions into functions.
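
The definition of the limit circle can be illustrated numerically when the set of limit points is finite. The following is a minimal sketch, not part of the original paper: it assumes the limit points are already known and given as a finite list of complex numbers, and it uses the elementary fact that the smallest enclosing circle of a finite point set is determined by two or three of its points. The function names (`limit_circle`, `circle_from_two`, `circle_from_three`) are hypothetical.

```python
from itertools import combinations


def circle_from_two(p, q):
    """Circle having the segment pq as a diameter."""
    center = (p + q) / 2
    return center, abs(p - center)


def circle_from_three(p, q, r):
    """Circumcircle of three points, or None if they are (nearly) collinear."""
    ax, ay, bx, by, cx, cy = p.real, p.imag, q.real, q.imag, r.real, r.imag
    d = 2 * (ax * (by - cy) + bx * (cy - ay) + cx * (ay - by))
    if abs(d) < 1e-12:
        return None
    a2, b2, c2 = ax * ax + ay * ay, bx * bx + by * by, cx * cx + cy * cy
    ux = (a2 * (by - cy) + b2 * (cy - ay) + c2 * (ay - by)) / d
    uy = (a2 * (cx - bx) + b2 * (ax - cx) + c2 * (bx - ax)) / d
    center = complex(ux, uy)
    return center, abs(p - center)


def limit_circle(limit_points, tol=1e-9):
    """Return (center, radius) of the least circle containing all the points.

    Brute force over all pairs and triples; intended only for small sets.
    """
    pts = [complex(z) for z in limit_points]
    if len(pts) == 1:
        return pts[0], 0.0
    candidates = [circle_from_two(p, q) for p, q in combinations(pts, 2)]
    for triple in combinations(pts, 3):
        circle = circle_from_three(*triple)
        if circle is not None:
            candidates.append(circle)
    best = None
    for center, radius in candidates:
        encloses_all = all(abs(z - center) <= radius + tol for z in pts)
        if encloses_all and (best is None or radius < best[1]):
            best = (center, radius)
    return best


# The sequence 1, -1, 1, -1, ... has the two limit points 1 and -1.
h, r = limit_circle([1, -1])
```

For the two limit points $1$ and $-1$ the sketch returns center $h = 0$ and radius $r = 1$; the same values result for the three cube roots of unity.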

* Presented to the Society, December 27, 1932; received by the editors November 15, 1932, and
after revision, May 17, 1933.
† Here and below numbers in square brackets refer to the bibliography at the end of the paper.

2. Sequence to function transformations. Instead of the transformation
mentioned in the introduction we shall study the following more general
transformation $S$. Let $T$ be a set of points in the complex plane having a
limit point $t_0$ (finite or infinite) not belonging to $T$. We shall speak of a point
$t$ in $T$ as being sufficiently advanced if for some $\delta>0$, $|t-t_0| < \delta$ when $t_0$ is
finite, or $|1/t| < \delta$ when $t_0$ is infinite. Then let $K_i(t)$ be a set of complex numbers
defined for $i=1, 2, \dots$, and each $t$ in $T$, and such that

$S:\ g(t) = \sum_{i=1}^{\infty} K_i(t)\, x_i$

is defined for each $t$ in $T$. We shall refer to the limit points of $g(t)$ as $t \to t_0$
simply as the limit points of $g(t)$.

We shall now prove 

THEOREM 2.1. Let $\{x_n\}$ be a bounded sequence of complex elements. If the
$K_i(t)$ satisfy the conditions

(2.11) $\lim_{t \to t_0} K_i(t) = k_i$, for each $i$,

(2.12) $\sum_{i=1}^{\infty} |K_i(t)| < M,$

for all sufficiently advanced $t$, $M$ a constant, then the quantities $\alpha$ the center and
$D$ the radius of the limit circle of the function $\sum_{i=1}^{\infty} K_i(t)$,

$A = \alpha - \sum_{i=1}^{\infty} k_i,\quad B = \sum_{i=1}^{\infty} k_i x_i,\quad C = \limsup_{t \to t_0} \sum_{i=1}^{\infty} |K_i(t) - k_i|$

exist, and the limit points of $g(t)$ lie in the circle of center $H=Ah+B$, and
radius $R=Cr+D|h|$, where $h$ is the center and $r$ the radius of the limit circle
of $\{x_n\}$.

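The existence of $\alpha$, $A$, $B$, $C$, $D$ is easy to establish and the details will not
be given here. For the remainder of the proof write the inequality

$|g(t) - Ah - B| \le \sum_{i=1}^{p} |K_i(t) - k_i| \cdot |x_i - h| + \sum_{i=p+1}^{\infty} |K_i(t) - k_i| \cdot |x_i - h| + |h| \cdot \left| \sum_{i=1}^{\infty} K_i(t) - \alpha \right|.$

Choose $\epsilon>0$, and $p$ so great that for all $i>p$, $|x_i-h| < r+\epsilon$. Then

$\limsup_{t \to t_0} |g(t) - Ah - B| \le o(1) + (r + \epsilon) \cdot \limsup_{t \to t_0} \sum_{i=1}^{\infty} |K_i(t) - k_i| + |h| \cdot \limsup_{t \to t_0} \left| \sum_{i=1}^{\infty} K_i(t) - \alpha \right|,$

and since the inequality holds for all $\epsilon>0$, the theorem follows.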
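
The inequality used in this proof rests on an algebraic identity that the text does not write out. The following is a sketch of that step, using only the definitions of $A$ and $B$ in Theorem 2.1; it assumes, as the theorem asserts, that the series $\sum k_i$ and $\sum k_i x_i$ converge.

```latex
\begin{aligned}
g(t) - Ah - B
  &= \sum_{i=1}^{\infty} K_i(t)\,x_i
     - \Bigl(\alpha - \sum_{i=1}^{\infty} k_i\Bigr)h
     - \sum_{i=1}^{\infty} k_i x_i \\
  &= \sum_{i=1}^{\infty} \bigl[K_i(t) - k_i\bigr](x_i - h)
     + h\Bigl[\sum_{i=1}^{\infty} K_i(t) - \alpha\Bigr].
\end{aligned}
```

Taking absolute values and splitting the first sum at $i = p$ gives the three terms of the inequality; the terms with $i \le p$ tend to zero as $t \to t_0$ by (2.11), which accounts for the $o(1)$.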
The remaining theorems of this section will be seen to be in part consequences
of Theorem 2.1. Notations already introduced will be freely used,
and $\{x_n\}$ will be taken bounded throughout the discussion. In particular $h$
and $r$ will in each case be taken to depend on $\{x_n\}$.

Theorem 2.1 easily yields the sufficiency of the following theorem of
Schur [1].


THEOREM 2.2. In order that $S$ may be such that $\lim_{t \to t_0} g(t)$ exists whenever
$h=r=0$, it is necessary and sufficient that the $K_i(t)$ satisfy (2.11), and (2.12).


The theorem just stated can be generalized to 


THEOREM 2.3. Let $N$ be a real non-negative constant. In order that the limit
points of $g(t)$ shall lie in a circle of radius $N|h|$, whenever $r=0$, it is necessary
and sufficient that the $K_i(t)$ satisfy the conditions (2.11), (2.12), and $D \le N$.


The sufficiency follows from Theorem 2.1, and the necessity of the first
two conditions from Theorem 2.2. For the necessity of the condition $D \le N$
we need only consider the special case $x_n = 1$ $(n=1, 2, \dots)$.

To supplement Theorem 2.3 we can give the following theorem which
takes into account the position of the limit points of $g(t)$.


THEOREM 2.4. Let $N$ be a real non-negative constant. In order that $S$ may be
such that the limit points of $g(t)$ shall lie in a circle of center $h$ and radius $N|h|$,
whenever $r=0$, it is necessary and sufficient that the $K_i(t)$ satisfy the conditions
(2.11), (2.12), $D \le N$, $k_i=0$, for all $i$, and $\alpha=1$.


The proof readily follows from a consideration of Theorems 2.1 and 2.3,
and, for the necessity of the two last conditions, the sequences $x_i=0$, $i \ne j$,
$x_j=1$, and the sequence $x_i=1$ $(i, j=1, 2, \dots)$.

Obviously the special case $N=0$ yields the well known conditions for
regularity, namely the conditions under which $\lim_{t \to t_0} g(t) = \lim_{n \to \infty} x_n$.

We shall now give two theorems which are concerned with divergent
sequences.


THEOREM 2.5. Let $Q$ be a real non-negative constant. In order that $S$ may be
such that the limit points of $g(t)$ shall lie in a circle of radius $Qr$ whenever $h=0$,
it is necessary and sufficient that the $K_i(t)$ satisfy the conditions (2.11), (2.12),
and $C \le Q$.

In the proof we encounter difficulty only in connection with establishing
necessity of the condition $C \le Q$. We shall assume that (2.11) and (2.12) hold
and show that the remaining condition also holds.

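Suppose on the contrary $C>Q$. Then for some $\lambda>0$ we have

$\sum_{i=1}^{\infty} |K_i(t) - k_i| > Q + 5\lambda$

repeatedly as $t$ approaches $t_0$. There exists, therefore, a sequence $\{t_p\}$ lying
entirely in the range for which (2.12) is satisfied, and such that $\lim_{p \to \infty} t_p = t_0$;
and an increasing sequence of integers $\{n_p\}$ for which the following inequalities
hold:

$\sum_{i=1}^{n_{p-1}+1} |K_i(t_p) - k_i| < \lambda,\qquad \sum_{i=1}^{\infty} |K_i(t_p) - k_i| > Q + 5\lambda,$

$\sum_{i=n_p+1}^{\infty} |K_i(t_p) - k_i| < \lambda,\qquad \sum_{i=n_{p-1}+2}^{n_p} |K_i(t_p) - k_i| > Q + 3\lambda.$

In the set $K_i(t_p) - k_i$, $n_{p-1}+2 \le i \le n_p$, there is surely one value $K_q(t_p) - k_q$
which is not zero.

We now define a sequence having a limit circle of center zero and radius
one*

$x_i = (-1)^{p-1} \operatorname{sgn} [K_i(t_p) - k_i],\quad n_{p-1}+2 \le i \le n_p,$
$x_{n_{p-1}+1} = (-1)^{p} \operatorname{sgn} [K_q(t_p) - k_q].$

We shall establish the desired contradiction if we show that the limit
circle of

$g(t) = \sum_{i=1}^{\infty} K_i(t)\, x_i$

has a radius greater than $Q$. Write

$\sum_{i=1}^{\infty} [K_i(t_p) - k_i]\, x_i = \sum_{i=1}^{n_{p-1}+1} [K_i(t_p) - k_i]\, x_i + \sum_{i=n_{p-1}+2}^{n_p} [K_i(t_p) - k_i]\, x_i + \sum_{i=n_p+1}^{\infty} [K_i(t_p) - k_i]\, x_i.$

The first and third terms on the right are each less than $\lambda$ in absolute value,
and the real middle term is greater than $Q+3\lambda$ for $p$ odd and less than
$-(Q+3\lambda)$ for $p$ even. Hence, writing $R(z)=$ real part of $z$,

$R\left[ \sum_{i=1}^{\infty} [K_i(t_p) - k_i]\, x_i \right] > Q+\lambda$, $p$ odd,

$R\left[ \sum_{i=1}^{\infty} [K_i(t_p) - k_i]\, x_i \right] < -(Q+\lambda)$, $p$ even,

and $g(t)$ has a limit circle of radius greater than $Q$, which completes the proof.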

* We use the definition $\operatorname{sgn}(z) = |z|/z$, $z \ne 0$, and $\operatorname{sgn}(z)=0$, $z=0$.

The conditions (2.11), (2.12), and $C \le Q$, remain necessary but not sufficient
when Theorem 2.5 is written without the hypothesis $h=0$. We can,
however, state necessary and sufficient conditions if we restrict ourselves to
a consideration of conservative transformations, that is, those transformations
for which $\lim_{t \to t_0} g(t)$ exists whenever $\lim_{n \to \infty} x_n$ exists. The conditions for
conservatism are (2.11), (2.12), and $D=0$. Clearly in the conservative case,
the condition $C \le Q$ is necessary and sufficient for the limit points of $g(t)$ to
lie in a circle of radius $Qr$. We can also take into account the position of the
limit points and state


THEOREM 2.6. Let $Q$ be a constant, $Q \ge 1$. In order that the conservative transformation
$S$ may be such that the limit points of $g(t)$ shall lie in a circle of radius
$Qr$ and center $h$, whenever $\{x_n\}$ is bounded, it is necessary and sufficient that
$S$ be regular, and that $C \le Q$.


An example due to W. A. Hurwitz yields an interesting comparison between
the work of this paper and that of Hurwitz and Knopp. Apply to the
sequence $x_n=\omega^{2n}$, $\omega^3=1$, the transformation defined by $K_{n,i}=(-1)^n\omega^i/3$
$(i=n, n+1, n+2)$, and $K_{n,i}=0$, otherwise. The resulting sequence, $g_n=(-1)^n$,
has its limit points within the limit circle of the original sequence as is to be
expected from our theory but the oscillation of $\{g_n\}$ is greater than that of
$\{x_n\}$ and one of its limit points lies outside the limit core of $\{x_n\}$.
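
Hurwitz's example can be checked numerically. The sketch below is an illustration only, not part of the original text: it takes $\omega$ to be the primitive cube root of unity $e^{2\pi i/3}$ (an assumption; the text requires only $\omega^3 = 1$), evaluates $g_n = \sum_i K_{n,i} x_i$ directly, and also computes the row sums entering the quantities $C$ and $D$ of Theorem 2.1. The function names are hypothetical.

```python
import cmath

# Assumed choice of omega: the primitive cube root of unity exp(2*pi*i/3).
OMEGA = cmath.exp(2j * cmath.pi / 3)


def x(n):
    """The sequence x_n = omega^(2n)."""
    return OMEGA ** (2 * n)


def K(n, i):
    """K_{n,i} = (-1)^n * omega^i / 3 for i = n, n+1, n+2, and 0 otherwise."""
    if n <= i <= n + 2:
        return (-1) ** n * OMEGA ** i / 3
    return 0


def g(n):
    """Transformed sequence g_n = sum_i K_{n,i} x_i (three non-zero terms)."""
    return sum(K(n, i) * x(i) for i in range(n, n + 3))


def row_sum(n):
    """sum_i K_{n,i}; its limit circle gives alpha and D in Theorem 2.1."""
    return sum(K(n, i) for i in range(n, n + 3))


def row_abs_sum(n):
    """sum_i |K_{n,i}|; here every k_i is 0, so this is the sum defining C."""
    return sum(abs(K(n, i)) for i in range(n, n + 3))


values = [g(n) for n in range(1, 7)]
```

Up to rounding, `g(n)` equals $(-1)^n$, `row_sum(n)` vanishes and `row_abs_sum(n)` equals $1$. In the notation of Theorem 2.1 this corresponds to $k_i = 0$, $\alpha = 0$, $D = 0$ and $C = 1$, so that the limit points of $g_n$ lie in the circle of center $0$ and radius $r = 1$, in agreement with the statement above.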

3. Function to function transformations. I. In the following let $f(x)$ be a
complex function of the real variable $x$ defined and integrable Lebesgue in
each interval $a \le x \le x_1 < \xi$, where $x_1$ is arbitrary and $\xi$ is finite or infinite.

We shall call the following the transformation $S_1$. Choose a point set $T$
as in the definition of $S$, and a function $K_1(t, x)$ defined for each $t$ in $T$, and
each $x$, $a \le x < \xi$, integrable Lebesgue in each interval $a \le x \le x_1 < \xi$, for each
$t$, such that

$S_1:\ g_1(t) = \int_a^{\xi} K_1(t, s) f(s)\, ds$

exists for each $t$ in $T$.
We shall now give without proof a theorem analogous to Theorem 2.1. 


THEOREM 3.1. Let $f(x)$ be bounded, $a \le x < \xi$. If $S_1$ is such that $K_1(t, x)$
satisfies the conditions

(3.11) $\lim_{t, u \to t_0} \int_a^{x} |K_1(t, s) - K_1(u, s)|\, ds = 0,\quad a \le x < \xi,$

(3.12) $\int_a^{\xi} |K_1(t, s)|\, ds < M,$

for all sufficiently advanced $t$, $M$ a constant, then the quantities $\alpha_1$ the center and
$D_1$ the radius of the limit circle of the function $\int_a^{\xi} K_1(t, s)\, ds$,

$A_1 = \alpha_1 - \lim_{x \to \xi} \lim_{t \to t_0} \int_a^{x} K_1(t, s)\, ds,$

$B_1 = \lim_{x \to \xi} \lim_{t \to t_0} \int_a^{x} K_1(t, s) f(s)\, ds,$

$C_1 = \limsup_{t \to t_0} \lim_{x \to \xi} \lim_{u \to t_0} \int_a^{x} |K_1(t, s) - K_1(u, s)|\, ds$

exist, and the limit points of $g_1(t)$ lie in a circle of center $H_1=A_1h+B_1$, and
radius $R_1=C_1r+D_1|h|$, where $h$ is the center and $r$ the radius of the limit circle
of $f(x)$.

The sufficiency of theorems analogous to those in §2 can easily be estab- 
lished, but for a complete theory analogous to that in §2 we need the trans- 
formations in the next section. 

4. Function to function transformations. II. We shall call the following
transformation $S_2$. Choose a function $K_2(t, x)$ which has all the properties of
$K_1(t, x)$ and the additional property that $K_2(t, x)$ is continuous in $x$, uniformly
for all sufficiently advanced $t$, and all $x$, $a \le x \le q$, where $q$ is an arbitrary
constant less than $\xi$. The transformation is then given by

$S_2:\ g_2(t) = \int_a^{\xi} K_2(t, s) f(s)\, ds.$

We can establish


THEOREM 4.1. Let $f(x)$ be bounded, $a \le x < \xi$. If $S_2$ is such that $K_2(t, x)$
satisfies the conditions

(4.11) $\lim_{t \to t_0} K_2(t, x) = k(x),\quad a \le x < \xi,$

(4.12) $\int_a^{\xi} |K_2(t, s)|\, ds < M,$

for all sufficiently advanced $t$, $M$ a constant, then the quantities $\alpha_2$ the center and
$D_2$ the radius of the limit circle of the function $\int_a^{\xi} K_2(t, s)\, ds$,

$A_2 = \alpha_2 - \int_a^{\xi} k(s)\, ds,\quad B_2 = \int_a^{\xi} k(s) f(s)\, ds,$

$C_2 = \limsup_{t \to t_0} \int_a^{\xi} |K_2(t, s) - k(s)|\, ds$

exist, and the limit points of $g_2(t)$ lie in a circle of center $H_2=A_2h+B_2$, and
radius $R_2=C_2r+D_2|h|$, where $h$ is the center and $r$ the radius of the limit
circle of $f(x)$.




The proof of this theorem may be made to depend upon that of Theorem
3.1 by showing that $k(x)$ is continuous, $a \le x < \xi$, that $K_2(t, x)$ approaches
$k(x)$ uniformly over $a \le x \le q$, for arbitrary $q$ less than $\xi$, and that $k(x)$ is
integrable over $a \le x < \xi$. The details will not be given here.

We can now state 


THEOREM 4.2. In order that $S_2$ may be such that $\lim_{t \to t_0} g_2(t)$ exists whenever
$h=r=0$, it is necessary and sufficient that $K_2(t, x)$ satisfy the conditions (4.11)
and (4.12).


The sufficiency follows from Theorem 4.1. The necessity can be estab- 
lished by using the methods of Silverman [5], and Schur [1]; and a considera- 
tion of the fact that if f(x) is measurable a<x<b, then sgn f(x) is also 
measurable in this interval. 

The remaining analogues of the theorems in §2 can easily be stated and 
proved by methods suggested in that section, and will not be given here. 

5. Bounds of the sets of limit points. It is easy to see that in parts of our 
discussion we can replace the limit circle by some other circle which contains 
the limit points of the sequence or function. In particular we can replace it 
by a circle with center at the origin and radius equal to the maximum of the 
distances from the origin to the limit points. This radius which is a bound for
the set of limit points may be written in the case of sequences as $\limsup_{n \to \infty} |x_n|$.
We can state


THEOREM 5.1. Let $Q$ be a real non-negative constant. In order that

$\limsup_{t \to t_0} |g(t)| \le Q \limsup_{n \to \infty} |x_n|$

whenever $\{x_n\}$ is bounded, it is necessary and sufficient that the $K_i(t)$ satisfy the
conditions (2.11), (2.12), $C \le Q$, and $k_i=0$, for all $i$.


For the proof of necessity consider Theorem 2.5 and, for the last condition,
the sequence $x_i=0$ $(i=1, 2, \dots)$; and the sequences $x_j=1$, $x_i=0$, $i \ne j$
$(i, j=1, 2, \dots)$.

6. Application to series. We shall generalize some results due to Schur 
[1] and Kojima [6]. 

Let the series $w_0+w_1+w_2+\cdots$, with partial sums $W_n$, be the Cauchy
product of the two series $u_0+u_1+u_2+\cdots$, and $v_0+v_1+v_2+\cdots$, with
partial sums $U_n$ and $V_n$ respectively. We can write

$W_n = u_n V_0 + u_{n-1} V_1 + \cdots + u_0 V_n,$

which is a linear transformation on the sequence $\{V_n\}$. If we suppose that
$\sum |u_n|$ converges and that $\{V_n\}$ is bounded with limit circle of center $h$ and
radius $r$ we can apply Theorem 2.1 to show that the limit points of $\{W_n\}$ lie
in a circle of center $h \sum u_n$ and radius $r \sum |u_n|$.
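
The rewriting of the partial sums of a Cauchy product as a linear transformation of the partial sums of one factor can be verified on a small example. The following is a minimal sketch using exact rational arithmetic; the series $u_n = 2^{-n}$ and $v_n = (-1)^n$ are chosen here for illustration and do not appear in the original, and the function names are hypothetical.

```python
from fractions import Fraction
from itertools import accumulate


def cauchy_product(u, v):
    """Terms w_k = sum_{i<=k} u_i v_{k-i} of the Cauchy product."""
    n = min(len(u), len(v))
    return [sum(u[i] * v[k - i] for i in range(k + 1)) for k in range(n)]


def transformed_partial_sums(u, v):
    """W_n written as u_n V_0 + u_{n-1} V_1 + ... + u_0 V_n."""
    V = list(accumulate(v))  # partial sums V_0, V_1, ...
    n = min(len(u), len(v))
    return [sum(u[k - i] * V[i] for i in range(k + 1)) for k in range(n)]


# Illustrative data (an assumption of this sketch, not from the paper):
# u_n = 2^(-n) is absolutely convergent; v_n = (-1)^n has partial sums
# 1, 0, 1, 0, ... with limit circle of center 1/2 and radius 1/2.
u = [Fraction(1, 2 ** k) for k in range(12)]
v = [(-1) ** k for k in range(12)]

W_direct = list(accumulate(cauchy_product(u, v)))
W_transformed = transformed_partial_sums(u, v)
```

The two lists agree term by term. For the full series $\sum u_n = \sum |u_n| = 2$, so the circle described above has center $1$ and radius $1$; the computed values $W_n$ stay inside it.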

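Now write $U_n^{p} = \sum_{i=0}^{n} A_{n-i}^{p} u_i$, where $A_n^{p} = \binom{n+p}{n}$
and $p \ge 0$. If the Cesàro transform of order $p$, $C_n^{p}(u) = U_n^{p}/A_n^{p}$, is bounded,
the series $\sum u_n$ is said to be bounded $(C, p)$. Writing similar expressions for
the series $\sum v_n$ and $\sum w_n$ we get $W_n^{p+q+1} = \sum_{i=0}^{n} U_i^{p} V_{n-i}^{q}$, $q \ge 0$, which may be
written

$C_n^{p+q+1}(w) = \frac{1}{A_n^{p+q+1}} \sum_{i=0}^{n} A_i^{p} A_{n-i}^{q}\, C_i^{p}(u)\, C_{n-i}^{q}(v).$

If we regard this expression as a linear transformation on the $C_i^{p}(u)$ we
get $K_{n,i} = A_i^{p} A_{n-i}^{q} C_{n-i}^{q}(v)/A_n^{p+q+1}$ $(i=0, 1, 2, \dots, n)$, $K_{n,i}=0$ $(i>n)$. Then supposing
that $\sum v_n$ is bounded $(C, q)$, we have
$|K_{n,i}| < M/n^{p+1}$, $M$ a constant for all $n$, so that $\lim_{n \to \infty} K_{n,i}=0$. Furthermore
$\sum_{i=0}^{n} |K_{n,i}| < N$, $N$ a constant for all $n$, and $\sum_{i=0}^{n} K_{n,i} = C_n^{p+q+1}(v)$.
Hence if we call the center and radius of $\{C_n^{p}(u)\}$, and $\{C_n^{p+q+1}(v)\}$, $h_u$,
$r_u$, and $h_v$, $r_v$, respectively, we have


THEOREM 6.1. If $\sum u_n$ is bounded $(C, p)$ and $\sum v_n$ is bounded $(C, q)$, $p, q \ge 0$,
then the sequence $\{C_n^{p+q+1}(w)\}$ has its limit points in a circle of center $h_u h_v$,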

If we consider the two series $\sum u_n$ with partial sums $s_n$, and $\sum c_n u_n$, with
partial sums $t_n$ we get

$t_n = s_0(c_0 - c_1) + \cdots + s_{n-1}(c_{n-1} - c_n) + s_n c_n.$

On the basis of the assumptions that $\{s_n\}$ is bounded with limit circle of
center $h$ and radius $r$, and that $\sum_{n=0}^{\infty} |c_n - c_{n+1}|$ converges, we can show by
means of Theorem 2.1 that the limit points of $\{t_n\}$ lie in a circle of center
$h \cdot \lim_{n \to \infty} c_n + B$ (with $B$ as in Theorem 2.1) and radius $r \cdot \lim_{n \to \infty} |c_n|$.

Generalizations of the last result to the case when $\sum u_n$ is bounded $(C, p)$
for some $p>0$ can easily be arrived at on the basis of the work of Schur [1]
and Kojima [6].
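
The expression for the partial sums of $\sum c_n u_n$ in terms of the partial sums $s_n$ of $\sum u_n$ is the usual summation by parts. The sketch below, which is illustrative only and uses hypothetical function names, compares the direct partial sum with the rearranged form on integer data so that the comparison is exact.

```python
def partial_sums(terms):
    """Running totals s_0, s_1, ... of a finite list of terms."""
    out, total = [], 0
    for term in terms:
        total += term
        out.append(total)
    return out


def t_direct(u, c, n):
    """t_n computed directly as c_0 u_0 + ... + c_n u_n."""
    return sum(c[i] * u[i] for i in range(n + 1))


def t_by_parts(u, c, n):
    """t_n = s_0(c_0 - c_1) + ... + s_{n-1}(c_{n-1} - c_n) + s_n c_n."""
    s = partial_sums(u)
    return sum(s[i] * (c[i] - c[i + 1]) for i in range(n)) + s[n] * c[n]


# Illustrative data (an assumption of this sketch): an oscillating series
# with bounded partial sums and a monotone factor sequence.
u = [1, -1, 1, -1, 1, -1]
c = [6, 5, 4, 3, 2, 1]
checks = [t_direct(u, c, n) == t_by_parts(u, c, n) for n in range(len(u))]
```

Every entry of `checks` is `True`, as the identity requires. Regarded as a transformation of $\{s_n\}$, the coefficients are $K_{n,i} = c_i - c_{i+1}$ for $i < n$ and $K_{n,n} = c_n$, which is the form to which Theorem 2.1 is applied.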


BIBLIOGRAPHY 


1. J. Schur, Über lineare Transformationen in der Theorie der unendlichen
Reihen, Journal für die reine und angewandte Mathematik, vol. 151 (1920),
pp. 79-111.

2. W. A. Hurwitz, Some properties of methods of evaluation of divergent
sequences, Proceedings of the London Mathematical Society, (2), vol. 26 (1927),
pp. 231-248.

3. W. A. Hurwitz, The oscillation of a sequence, American Journal of
Mathematics, vol. 52 (1930), pp. 611-616.

4. K. Knopp, Zur Theorie der Limitierungsverfahren, Mathematische Zeitschrift,
vol. 31 (1930), pp. 97-127, and 276-305.

5. L. L. Silverman, On the notion of summability for the limit of a function
of a real variable, these Transactions, vol. 17 (1916), pp. 284-294.

6. T. Kojima, On generalized Toeplitz's theorem on limits, Tôhoku Mathematical
Journal, vol. 12 (1917), pp. 291-326.


CORNELL UNIVERSITY,
ITHACA, N. Y.


GROUPS IN WHICH EVERY OPERATOR HAS AT MOST 
A PRIME NUMBER OF CONJUGATES* 


BY 
G. A. MILLER 


Let $G$ represent a non-abelian group such that each of its operators is
either invariant or has $p$ conjugates, $p$ being a prime number which is
the same for every operator of $G$, and let $s_1$ represent one of the operators of
lowest order contained in $G$ and non-invariant under $G$. The subgroup $H_1$
composed of all the operators of $G$ which are commutative with $s_1$ is of index
$p$ under $G$, and includes $s_1$ as well as the central $H_0$ of $G$. It will be proved
first that $G$ involves only one Sylow subgroup of order $p^m$. If it could contain
more than one subgroup of order $p^m$ the powers of some operator of one of
these would transform the cross-cut of two of them into itself and would also
transform one of these two subgroups into itself and the other into exactly
$p$ distinct subgroups, since this operator may be so selected that its $p$th
power is in this cross-cut but it itself is not contained therein.

To prove that this operator would have more than $p$ conjugates under $G$
it is only necessary to observe that it would appear in one and only one of a
set of $p$ subgroups of order $p^m$ which would be conjugate under the powers
of some operator contained in one of the given $p+1$ subgroups of this order.
As these $p$ subgroups would contain an operator which would transform the
given operator into at least one additional conjugate, the following theorem
has been established:

If a group contains more than one Sylow subgroup of order $p^m$ then it contains
a set of conjugate operators whose order is a power of $p$ and whose number
exceeds $p$, where $p$ is any prime number.

By hypothesis $G$ transforms the operators contained in each set of conjugates
according to a transitive permutation group of degree $p$ and it has
just been proved that this transitive group cannot involve more than one
subgroup of order $p$ since such a subgroup is a Sylow subgroup therein. As
$G$ must be isomorphic with every such transitive group it follows directly
that each of these transitive groups is cyclic. Hence $G$ must be isomorphic
with an abelian group of type $(1, 1, 1, \dots)$ whose order is a power of $p$.
That is, $G$ is the direct product of a non-abelian group of order $p^m$ and an
abelian group whose order is prime to $p$. In what follows it may therefore be
assumed for the sake of simplicity that the order of $G$ is $p^m$. Since every subgroup
of index $p$ under a group of order $p^m$ is invariant thereunder it follows
that $H_1$ is an invariant subgroup of $G$.

* Presented to the Society, June 19, 1933; received by the editors March 13, 1933.


An operator of $G$ which is not found in $H_1$ transforms $s_1$ into $s_0 s_1$ where $s_0$
is commutative with $s_1$ since $H_1$ is an invariant subgroup of $G$ and all of its
operators are commutative with $s_1$. Hence it results that $s_0$ is of order $p$. To
prove that $s_0$ appears in the central of $G$ it may be noted that it must be
transformed into itself by each of the operators which do not appear in $H_1$
since every such operator must be commutative with a subgroup of index $p$
under $H_1$ and $s_0$ must be in such a subgroup since it arises from a $p$ automorphism.
If $t$ were a non-invariant operator of $G$ which would not be transformed
into itself multiplied by a power of $s_0$, then $G$ would involve operators which
would be commutative neither with $s_1$ nor with $t$. Such an operator would
therefore be transformed into more than $p$ conjugates under $G$. It has therefore
been proved that if every non-invariant operator of a group has a given
prime number of conjugates under this group then the order of the commutator
subgroup of this group is this prime number and its operators are invariant under
the group.

When $H_1$ is non-abelian it contains an invariant subgroup $H_2$ composed
of all of its operators which are commutative with one, $s_2$, of its operators of
lowest order. By continuing this process we finally arrive at an invariant
abelian subgroup $H_\lambda$ which involves $s_1, s_2, \dots, s_\lambda$ as well as $H_0$. A set of
independent generators of $H_\lambda$ can be so selected that it includes $s_1, s_2, \dots, s_\lambda$
since these were always chosen so as to be of the lowest possible order. As $H_0$
includes the $p$th power of every operator of $G$ it includes the $p$th powers of
$s_1, s_2, \dots, s_\lambda$ and the central quotient group of $G$ is abelian and of type $(1, 1,
1, \dots)$. In particular, $G/H_0$ has these properties. It should be noted that
$H_0$ is the direct product of the group generated by the $p$th powers of $s_1,
s_2, \dots, s_\lambda$ and some other group which may be the identity. The order of
$G/H_0$ is $p^{2\lambda}$ while that of is

To exhibit the fact that the preceding theorems relate to an extensive
category of groups it may be noted that for every value of $m>2$ there are
groups which belong to this category and that the number of these groups
increases with $m$. In particular, the two non-abelian groups of order $p^3$ belong
to this category and six of the non-abelian groups of order $p^4$ belong thereto.
When $p=2$ it is known that there are nine non-abelian groups of this order
and when $p>2$ their number is always ten. Hence more than one-half of the
non-abelian groups of order $p^4$ come under the heading of the present article.
For all of these groups $\lambda=1$ since $H_0$ cannot be the identity and the order of
$G/H_0$ is always $p^{2\lambda}$, as was noted above. Whenever $m>4$ then $\lambda$ can obviously
have more than one possible value.
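
For $p=2$ the statement that both non-abelian groups of order $p^3$ belong to this category can be checked by direct enumeration. The following is a minimal sketch under stated assumptions: the octic and quaternion groups are realized here as groups of $2 \times 2$ matrices with entries taken modulo 3, a choice of representation made for this illustration and not taken from the paper, and all function names are hypothetical.

```python
P = 3  # matrix entries are reduced modulo 3 (assumed representation)
IDENTITY = (1, 0, 0, 1)  # a matrix [[a, b], [c, d]] is stored as (a, b, c, d)


def mul(a, b):
    """Product of two 2x2 matrices modulo P."""
    a11, a12, a21, a22 = a
    b11, b12, b21, b22 = b
    return ((a11 * b11 + a12 * b21) % P, (a11 * b12 + a12 * b22) % P,
            (a21 * b11 + a22 * b21) % P, (a21 * b12 + a22 * b22) % P)


def generate(generators):
    """Closure of the generators under multiplication (a finite group)."""
    elements = {IDENTITY}
    frontier = [IDENTITY]
    while frontier:
        x = frontier.pop()
        for g in generators:
            y = mul(x, g)
            if y not in elements:
                elements.add(y)
                frontier.append(y)
    return elements


def inverse(x, group):
    """Inverse of x, found by search inside the finite group."""
    return next(y for y in group if mul(x, y) == IDENTITY)


def class_sizes(group):
    """Map each operator to the number of its conjugates under the group."""
    return {x: len({mul(mul(g, x), inverse(g, group)) for g in group})
            for x in group}


# Quaternion group: generators a, b with a^2 = b^2 = -1 and ba = -ab (mod 3).
QUATERNION = generate([(0, 2, 1, 0), (1, 1, 1, 2)])
# Octic group: a rotation of order 4 and a reflection of order 2.
OCTIC = generate([(0, 2, 1, 0), (1, 0, 0, 2)])

quaternion_sizes = sorted(class_sizes(QUATERNION).values())
octic_sizes = sorted(class_sizes(OCTIC).values())
```

In both groups two operators are invariant and each of the remaining six has exactly $2$ conjugates, so both lists are `[1, 1, 2, 2, 2, 2, 2, 2]`.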

When $m$ is given and greater than 2 the possible values of $\lambda$ depend upon
the type of the abelian group which is selected for $H_0$ since $\lambda$ may assume
successively the values $1, 2, \dots$ up to the number of the invariants of this
abelian group if at least one of these invariants exceeds $p$. When all of these
invariants are equal to $p$ the value of $\lambda$ may be any positive integer which
does not exceed the number of these invariants diminished by unity in view
of the following theorem:

If a non-abelian group $G$ in which all the non-invariant operators have
exactly a prime number $p$ conjugates contains a non-invariant operator $s_1$ of
order $p$, and if the subgroup composed of all of its operators which are commutative
with $s_1$ is either abelian or contains a non-invariant operator $s_2$ of order $p$,
etc., then the operators $s_1, s_2, \dots, s_\lambda$ generate an abelian group of order $p^\lambda$
which does not include the commutator subgroup of $G$.

To prove this theorem it is only necessary to note that this abelian subgroup
contains no operator besides the identity which is invariant under $G$.

To construct all the groups of order $p^m$ which satisfy the conditions imposed
on $G$ at the opening of this article we may first consider those in which
$\lambda=1$, then those in which $\lambda=2$, etc., until we arrive at those in which $\lambda$ has
its largest possible value, viz., $(m-1)/2$ when $m$ is odd and $(m-2)/2$ when
$m$ is even. For the central of such groups we may take successively every possible
abelian group of order $p^{m-2\lambda}$, and two groups in which the centrals are
distinct groups must themselves be distinct, so that we may avoid duplicates
by classifying these groups of the same order according to their distinct
centrals. When $H_\lambda$ is cyclic, $\lambda=1$ since $H_\lambda$ involves $s_1, s_2, \dots, s_\lambda$ as independent
generators. Moreover, $G$ must then be the quaternion group since $H_1$
is of index $p$ and $G$ must then involve $p+1$ cyclic subgroups of this index.
It is well known that the quaternion group is the only group of order $p^m$,
$m>2$, which involves $p+1$ cyclic subgroups of index $p$.

When $\lambda=1$ general formulas for the totality of the non-abelian groups
which come under the heading of the present article may be obtained as
follows. Suppose first that all the invariants of $H_0$ are equal to a fixed number
$p^\alpha$. It is known that if any abelian group has a subgroup of prime index then
it is always possible to find a set of reduced independent generators of this
group which has the property that all the operators of this set except one
appear in this subgroup.* Hence a set of reduced independent generators of
each of the $p+1$ abelian subgroups of index $p$ contained in $G$ can be so
selected that all except one of them appear in $H_0$. The additional independent
generator of such a reduced set can therefore be so selected in the present
case that its order is either $p$ or $p^{\alpha+1}$. In what follows it will be assumed that
such a selection of a set of the reduced independent generators of these subgroups
has been made.

* G. A. Miller, Bulletin of the American Mathematical Society, vol. 23 (1916), p. 14. The term
"reduced set of independent generators" was used with its present meaning with respect to abelian
groups in these Transactions, vol. 16 (1915), p. 22.

When $H_0$ is cyclic there are always two possible groups. In the special
case when $p=2$ and $m=3$ these are the octic and the quaternion groups
while in all the other cases the additional generators of two abelian subgroups
of index $p$ under $G$ can be so selected that either both are of order $p$
or one is of order $p$ and the other is of order $p^{m-1}$. When $H_0$ has two equal
invariants $p^\alpha$ there are four groups. In one of these the additional generators
of two of the abelian subgroups of index $p$ can be so chosen that both are
of order $p$. In two others one of these generators is of order $p$ while the other
is of order $p^{\alpha+1}$ except when $p=2$ and $m=4$. In this special case there is only
one such additional group while there are two groups in which all the additional
generators are of order $p^{\alpha+1}$. In the other cases there is only one such
group. When the number of the equal invariants of $H_0$ exceeds two there is
one additional group in which all the additional independent generators are
of order $p^{\alpha+1}$ since the commutator subgroup for such groups can then be
chosen in two distinct ways. When $p=2$ and $m=4$ this commutator subgroup
can be chosen in three different ways but there are then only two distinct
groups under the other cases.

For the sake of simplifying the consideration of the general case when $\lambda=1$
we let $k_1$ represent the number of the different values of the invariants of $H_0$,
$k_2$ the number of the sets composed separately of all the equal invariants
whenever such a set involves at least two such invariants, $k_3$ the number of
such sets such that each set involves at least three equal invariants. The number
of the distinct groups of order $p^m$ which contain this $H_0$ and in which the
additional invariant of each of two abelian subgroups of index $p$ is equal to
$p$ is then $k_1$, since $H_0$ contains $k_1$ sets of subgroups of order $p$ such that each
set is composed of all of its subgroups of this order which are conjugate under
the holomorph of $H_0$.

If one of the $p+1$ abelian subgroups of index $p$ under $G$ has an invariant
which is equal to $p$ but is not included among the invariants of $H_0$, while
another has a larger invariant having these properties, there are $k_1^2+k_2$
groups of order $p^m$ since the commutator subgroup may be taken from any
one of the $k_1$ sets of conjugate subgroups of order $p$ under the holomorph of
$H_0$ and the second independent generator which does not appear among the
independent generators of $H_0$ may have its $p$th power in any one of the $k_1$
sets of conjugate operators such that each set is composed of all the operators
which can be separately used as an independent generator of $H_0$. In the
special case when at least one of the invariants of $H_0$ is 2, one of the groups of
this case is included among the $k_1$ groups defined in the preceding paragraph.
Hence there are then only $k_1^2+k_2-1$ additional groups.

Finally, when none of the additional invariants is equal to $p$ the $p$th
powers of the independent generators of the $p+1$ abelian subgroups of index
$p$ under $G$ which are not also independent generators of $H_0$ are equal to
distinct independent generators of $H_0$ except possibly when $p=2$ and these
additional invariants are equal to 4. In the latter case these operators of
order 4 may generate the quaternion group and then the commutator subgroup
of $G$ is the subgroup of order 2 contained in this quaternion group.
In all other cases two equal additional independent generators can be selected
in $k_2$ essentially different ways while two such unequal generators
can be selected in $k_1(k_1-1)/2$ essentially different ways.

In the former case the number of the distinct groups when none of these invariants is equal to 2 is $k_1k_2+k_3$ since in $k_3$ cases the subgroups of order $p$ in $H_0$ which are conjugate under its holomorph are not conjugate in the holomorph of $G$. In the latter case the number of these groups is $k_1^2(k_1-1)/2+k_2(k_1-1)$. Hence the total number of the distinct $G$'s when $\lambda=1$ and $H_0$ does not involve an invariant which is equal to 2 is


$$k_1+k_1^2(k_1+1)/2+2k_1k_2+k_3.$$
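
As a check on the arithmetic, the counts stated for the separate cases can be summed mechanically. The following is a minimal sketch in Python; the function names are illustrative only, and the individual counts are taken as read from the text, so the sketch confirms only that they add up to the displayed total.

```python
def count_by_cases(k1, k2, k3):
    """Sum the counts quoted in the text for the separate cases."""
    both_equal_to_p = k1                      # additional invariants both equal to p
    one_equal_one_larger = k1**2 + k2         # one equal to p, the other larger
    none_equal_same = k1 * k2 + k3            # none equal to p, equal generators
    none_equal_distinct = k1**2 * (k1 - 1) // 2 + k2 * (k1 - 1)  # unequal generators
    return (both_equal_to_p + one_equal_one_larger
            + none_equal_same + none_equal_distinct)


def closed_form(k1, k2, k3):
    """The displayed total k1 + k1^2 (k1 + 1)/2 + 2 k1 k2 + k3."""
    return k1 + k1**2 * (k1 + 1) // 2 + 2 * k1 * k2 + k3


def totals_agree(limit):
    """Compare both expressions for all 1 <= k1 < limit, k3 <= k2 <= k1."""
    return all(
        count_by_cases(k1, k2, k3) == closed_form(k1, k2, k3)
        for k1 in range(1, limit)
        for k2 in range(0, k1 + 1)
        for k3 in range(0, k2 + 1)
    )


print(totals_agree(10))
```

The integer divisions are exact, since $k_1^2(k_1-1)$ and $k_1^2(k_1+1)$ are always even.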


This formula gives also the correct number of groups when $H_0$ involves at least one invariant which is equal to 2. It was noted above that in this case the number of groups in which only one additional invariant of the $p+1$ abelian subgroups of index $p$ under $G$ is $p$ is one less than in the other cases, but the number of the groups in which all the additional invariants are equal to $p^2$ is then one more than in the other cases in view of the existence of the quaternion group. When $H_0$ involves more than one invariant which is equal to 2 and the commutator of order 2 is the square of one of the additional independent generators of order 4, and the other has a different square, then the operators of order 2 in the group generated by these operators of order 2 are not conjugate under its holomorph while they are thus conjugate in the other cases. This however does not affect the number of the possible distinct groups in this case.

If a group of order $p^m$ contains more than one abelian subgroup of index $p$ then the cross-cut of two such subgroups is its central and its commutator subgroup is of order $p$. Hence such a group belongs to the category of groups defined by the heading of the present article and the value of $\lambda$ in this case is unity. The formula given in the preceding paragraph therefore gives also the number of these distinct groups whenever we use successively for $H_0$ the different possible abelian groups of order $p^{m-2}$. Since every group of order $p^4$ contains at least one abelian subgroup of order $p^3$, it results that the only non-abelian groups of order $p^4$ which are not enumerated by the given formula are those which contain only one abelian subgroup of index $p$. There are three such groups when $p=2$, but whenever $p>2$ there are four such groups.


UNIVERSITY OF ILLINOIS,
URBANA, ILL.


POLYNOMIAL DIOPHANTINE SYSTEMS* 


BY 
E. T. BELL 


1. Formal definitions of polynomial diophantine systems, as understood 
in this paper, are given in §5, after the necessary algebraic preliminaries in 
§2 and certain arithmetical considerations in §4. The algebraic structure of 
the systems is stated in §3. Roughly, the systems are of the following type. 

As always henceforth, let $n$ denote an arbitrary constant (finite) integer $>1$. Let $a, b, \cdots, c$ be $m$ ($m$ finite, $\geq 1$) integers $>0$, and let $P_\alpha, Q_\beta, \cdots, R_\gamma$ ($\alpha=1,\cdots,a$; $\beta=1,\cdots,b$; $\gamma=1,\cdots,c$) be polynomials in $sn$ independent variables $x_1,\cdots,x_{sn}$ ($s$ finite, $>1$) with integer (not necessarily rational integer) coefficients. Let the system $\Sigma$,

$$P_1=\cdots=P_a,\qquad Q_1=\cdots=Q_b,\qquad\cdots,\qquad R_1=\cdots=R_c,$$

be consistent and indeterminate.

The systems $\Sigma$ considered have an infinity of integer solutions, all of which can be given explicitly by expressing $x_1,\cdots,x_{sn}$ as polynomials in parameters ranging over all integers, with integer coefficients, and for this complete solution only an application of the fundamental theorem of arithmetic (unique prime decomposition) is necessary. Homogeneous and inhomogeneous systems $\Sigma$ are treated by the same analysis, and the degrees of the polynomials are unrestricted. The simple algorithm for obtaining the complete solution in integers is indicated in §7, with examples.†

The remarkable features are the complete, explicit solvability and the intimate connection with the fundamental theorem. In these respects systems $\Sigma$ are an immediate generalization in one direction of the Pythagorean equation $x^2+y^2=z^2$ and its complete solution in rational integers. Naturally, the polynomials composing a system $\Sigma$ can not be given arbitrarily; applicability of the fundamental theorem imposes necessary restrictions. We proceed to the algebra sufficient for the construction of general systems $\Sigma$.

2. Let $R$ be an abstract commutative ring in which the identities with respect to addition, multiplication are $z$, $u$ respectively, and in which the sum, product of any elements $a$, $b$ in $R$ are written $a+b$, $ab$ respectively.


* Presented to the Society, October 28, 1933; received by the editors May 26, 1933. 

† In all of the examples constructed (only 3 are reproduced here) all of the polynomials in the
systems are irreducible in the ring of their values, but I can not prove that all polynomials in the most 
general system constructed are irreducible. 




Since $R$ may contain nilfactors, $ax=bx$, $x\neq z$ do not necessarily imply $a=b$. In the special case when $R$ is a domain of integrity, the cancellation law holds. Conversely, in the special case when cancellation holds, $R$ is a domain of integrity. For if $pq=z$ with $p\neq z$, $q\neq z$, then $pq=pz$, and hence $q=z$, a contradiction. Unless so stated it is not assumed that $R$ (or any other commutative ring) is a domain of integrity.

The elements of $R$ will be called integers; to avoid confusion, $0, \pm 1, \pm 2, \cdots$ will always be characterized as rational integers.

Integers other than $z$, $u$ will frequently be indicated by a multiple index notation, $x(i,j)$, $y(k,t)$, $\cdots$, the $i, j, k, t, \cdots$ being the indices. In a symbol with precisely 2 indices, say $x(i,j)$, the first index ($i$) ranges over all rational integers $>0$, the second ($j$) ranges over only $1,\cdots,n$.

A symbol with precisely 1 index, say $x(i)$, denotes a vector (one-rowed matrix) of $n$ integers,

$$x(i)=(x(i,1),\cdots,x(i,n));$$


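the $j$th coordinate of $x(i)$ is $x(i,j)$. Vectors being matrices, vector equality, $x(i)=y(k)$, is matrix equality, $x(i,j)=y(k,j)$ ($j=1,\cdots,n$).
It is postulated that there exists in $R$ a set $\phi$ of integers $\phi(i,j,k)$ ($i,j,k=1,\cdots,n$) such that


(2.1) $\phi(1,s,k)=\delta_{sk}$ ($s,k=1,\cdots,n$), where $\delta_{ss}=u$, $\delta_{sk}=z$ ($s\neq k$);

(2.2) $\phi(i,j,k)=\phi(j,i,k)$ ($i,j,k=1,\cdots,n$);

(2.3) $\sum_{j=1}^{n}\phi(r,s,j)\,\phi(j,t,k)=\sum_{j=1}^{n}\phi(s,t,j)\,\phi(r,j,k)$ ($r,s,t,k=1,\cdots,n$).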

Let cp;(p, 7=1, - -, m) be integers, such that (7=1, - - - , m) and 
the determinant |c,;| (p row, 7 column) of the matrix ||c,,|| has the value . 
Let c,; denote the cofactor of cp; in |cp;|. Then c,;=u, and 


Spr = ir (p, 1, n). 
j=l 


j=l 


Define the set ¢’ of integers $’(r, s, (r, s, t=1, - - - , m) by 


¢’(r, 5, t) = k, h). 


Then it is easily seen (by manipulation of dummy suffixes as in tensor algebra) 
that 


H 
n n 
n 




Similarly, and by using (2.1)–(2.3), we see that the $\phi'(r,s,t)$ satisfy the same relations: the symbol $\phi$ in (2.1)–(2.3) can be replaced by $\phi'$. We shall say that the sets $\phi$, $\phi'$ are equivalent, $\phi\sim\phi'$. The relation of equivalence is reflexive ($\phi\sim\phi$), symmetric (if $\phi\sim\phi'$ then $\phi'\sim\phi$), and transitive (if $\phi\sim\phi'$ and $\phi'\sim\phi''$, then $\phi\sim\phi''$). The consistency of (2.1)–(2.3) need not be discussed, as instances of $\phi$ will be evident when diophantine systems are constructed.

The $j$th coordinate, denoted by $x(i_a,i_b;j)$, in the product $x(i_a)x(i_b)$ of any vectors $x(i_a)$, $x(i_b)$, to the base $\phi$, is defined by


(2.4) $x(i_a,i_b;j)=\sum_{j_a,j_b=1}^{n}\phi(j_a,j_b,j)\,x(i_a,j_a)\,x(i_b,j_b)$ ($j=1,\cdots,n$).

In a given context the same base $\phi$ is presupposed. All equations persist if $\phi$ be replaced by $\phi'$, where $\phi\sim\phi'$.
The notation $u(i)$ ($=u(k),\cdots$) is reserved for the vector defined by $u(i,j)=\delta_{1j}$ ($j=1,\cdots,n$). Hence, by (2.1), (2.4) we have


(2.5) $u(i)x(k)=x(k)$


for all $x(k)$. If possible, let $v(i)x(k)=x(k)$, $v(i)\neq u(i)$, all $x(k)$. Choosing $x(k)=u(k)$, and referring to (2.5), we have the contradiction $v(i)=u(i)$. Hence $u(i)$ is the unique identity of vector multiplication.
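
The product (2.4) and the identity (2.5) can be illustrated numerically. The following is a minimal sketch in Python, assuming $n=2$ and taking for $\phi$ the multiplication table of the basis $1$, $i$ of the Gaussian integers; this choice of $\phi$ and the names `phi`, `vec_mul` are illustrative assumptions, not part of the text.

```python
from itertools import product

n = 2

# Assumed base phi: structure constants of the basis w_1 = 1, w_2 = i,
# so that w_r * w_s = sum_j phi[(r, s, j)] * w_j.
phi = {key: 0 for key in product(range(1, n + 1), repeat=3)}
phi[(1, 1, 1)] = 1    # 1 * 1 = 1
phi[(1, 2, 2)] = 1    # 1 * i = i
phi[(2, 1, 2)] = 1    # i * 1 = i
phi[(2, 2, 1)] = -1   # i * i = -1


def vec_mul(x, y):
    """Coordinates of the product x*y to the base phi, following (2.4)."""
    return tuple(
        sum(
            phi[(ja, jb, j)] * x[ja - 1] * y[jb - 1]
            for ja in range(1, n + 1)
            for jb in range(1, n + 1)
        )
        for j in range(1, n + 1)
    )


u = (1, 0)    # u(i, j) = delta_{1j}
x = (3, -2)   # 3 - 2i
y = (1, 4)    # 1 + 4i

print(vec_mul(x, y))   # (11, 10), i.e. (3 - 2i)(1 + 4i) = 11 + 10i
print(vec_mul(u, x))   # (3, -2), in agreement with (2.5)
```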

The notation $z(i)$ ($=z(k),\cdots$) is reserved for the vector defined by $z(i,j)=z$ ($j=1,\cdots,n$); $z(i)x(k)=z(i)$ for all $x(k)$.

The sum $x(i_a)+x(i_b)$ is the vector whose $j$th coordinate is the sum (in $R$) of the $j$th coordinates of $x(i_a)$, $x(i_b)$ ($j=1,\cdots,n$). Since $z$ is the unique identity of addition in $R$, $z(i)$ is the unique identity of vector addition.

There can be no confusion between operations in R and the correspond- 
ing operations on vectors, since the notation for the operands indicates the 
species. 

Now (2.2), (2.3) are necessary and sufficient conditions for commuta- 
tivity and associativity of multiplication in any linear algebra with n basal 
units, and (2.1) is a necessary and sufficient condition for the existence of an 
identity with respect to multiplication in the algebra. Since any commuta- 
tive, associative linear algebra with an identity of multiplication has a vector 
representation with vector multiplication and addition as above defined, it 
follows that the set, VR, of all vectors is a commutative ring, in which the identities of multiplication, addition are $u(i)$, $z(i)$ respectively.

The notation $\epsilon(i)$, $\epsilon(k)$, $\cdots$ will be reserved for units in VR, which are defined as follows, and which are not to be confused with the usual unit vectors of linear algebra. If $\epsilon(i_1)$ is in VR, and if $\epsilon(i_2)$ exists in VR such that

$$\epsilon(i_1)\epsilon(i_2)=u(i),$$

$\epsilon(i_1)$ is a unit in VR, and $\epsilon(i_2)$ is its conjugate. Hence the conjugate of a unit is a unit. Since $u(i)$ is its own conjugate, units exist. If possible, let $\epsilon(i)=z(i)$. Then $\epsilon(i)\epsilon'(i)=z(i)\epsilon'(i)$, where $\epsilon'(i)$ is the conjugate of $\epsilon(i)$. Hence the contradiction (in $R$) $u=z$. Thus no unit is equal to the zero in VR.

Suppose for a moment that VR is a domain of integrity. If possible, let

$$\epsilon(i_1)\epsilon(i_2)=u(i)=\epsilon(i_1)\epsilon(i_3).$$

Then $\epsilon(i_1)[\epsilon(i_2)-\epsilon(i_3)]=z(i)$. With $\epsilon(i_1)\neq z(i)$ this gives the contradiction $\epsilon(i_2)=\epsilon(i_3)$. Hence, if VR is a domain of integrity, the conjugate of a given unit is unique.

We return to the general VR. In order that $\epsilon(i_1)$, $\epsilon(i_2)$ be conjugate units it is necessary and sufficient that

$$\sum_{r,s=1}^{n}\phi(r,s,j)\,\epsilon(i_1,r)\,\epsilon(i_2,s)=\delta_{1j}\qquad(j=1,\cdots,n).$$


Let $a(i)$, $b(i)$, $\alpha(i)$, $\beta(i)$, $\xi(i)$ be such that

$$a(i)=\alpha(i)\xi(i),\qquad b(i)=\beta(i)\xi(i).$$

If $a(i)$, $b(i)$ are given, a solution of these equations is of the type

$$\alpha(i)=a(i)\epsilon(i),\qquad \beta(i)=b(i)\epsilon(i),\qquad \xi(i)=\epsilon'(i),$$

where $\epsilon(i)$, $\epsilon'(i)$ are arbitrary conjugate units. If this type exhausts the solutions, $a(i)$, $b(i)$ are said to be coprime (in VR). Necessary and sufficient conditions that $a(i)$, $b(i)$ be coprime are that the system


or, s, r)é(i, s) = afi, 


s, AB(i, r)E(i, s) = 


NEG, NEG, 8) = 


be solvable in R for 
a(i, r), B(i, r), r) (r, 1, 


3. Let $x(1),\cdots,x(s)$ be any vectors, and let $s>2$. The $j$th coordinate in the product $x(1)\cdots x(s)$ will be denoted by $x(1,\cdots,s;j)$. By mathematical induction from (2.4) we find for $x(1,\cdots,s;j)$ the following explicit polynomial expression in $R$:


lg --, a. 
r 
(2.6) 




Xx x(1, 71)x(2, je) x(s, je) (i, Jes = 1, n). 


When $x(i)=x(i_1)=\cdots=x(i_t)$ ($t>1$) the product $x(i_1)\cdots x(i_t)$ is written $x^t(i)$, and its $j$th coordinate $x(i^t;j)$. Hence, the case $t=2$, $s_1=s_2=1$, included, we have defined $x^{s_1}(i_1)\cdots x^{s_t}(i_t)$ and its $j$th coordinate $x(i_1^{s_1},\cdots,i_t^{s_t};j)$.

If the $n$ coordinates of $x(i)$ are independent variables in $R$, $x(i)$ is called a variable (vector) in VR. The variables $x(i_1),\cdots,x(i_t)$ in VR are said to be independent if their $nt$ coordinates are $nt$ independent variables in $R$. Denote the power product $x^{s_1}(i_1)\cdots x^{s_t}(i_t)$, where $s_1,\cdots,s_t$ are rational integers $>0$, of independent variables $x(i_1),\cdots,x(i_t)$ in VR by $X(t)$, with a similar notation for any product of positive integral powers of independent variables in VR. The $r$ power products $X_1(t_1),\cdots,X_r(t_r)$ ($r>1$) in VR are said to be independent if all the variables in VR composing these $r$ products are independent in VR.

Denote the $j$th coordinate of $X_i(t_i)$ by $X_i(t_i;j)$, and let $X_1(t_1),\cdots,X_a(t_a)$ be independent. Then the equations


(3.1) $X_1(t_1)=\cdots=X_a(t_a)$

in VR are equivalent to the simultaneous system

(3.2) $X_1(t_1;j)=\cdots=X_a(t_a;j)$ ($j=1,\cdots,n$)


in $R$, as each of (3.1), (3.2) implies the other. With $a, b, \cdots, c$, $m$ as in §1, we pass to the general case. The power products in each row of

(3.3)
$X_1(p_1)=\cdots=X_a(p_a)$,
$Y_1(q_1)=\cdots=Y_b(q_b)$,
$\cdots$
$Z_1(r_1)=\cdots=Z_c(r_c)$


are independent; in any pair of rows, at least one product in one of the rows and one product in the other are not independent; the system (3.3) does not separate into two or more systems with the two preceding characteristics in sets of independent variables in VR having no variable in common. The set

(3.4)
$X_1(p_1;j)=\cdots=X_a(p_a;j)$,
$Y_1(q_1;j)=\cdots=Y_b(q_b;j)$,
$\cdots$
$Z_1(r_1;j)=\cdots=Z_c(r_c;j)$ ($j=1,\cdots,n$)

in $R$, equivalent to (3.3) in VR, will be called a polynomial system.




If the complete solution in integers of a polynomial system is obtainable 
in explicit form in terms of polynomials in integer parameters with integer 
coefficients, we shall say the system is diophantine. 

It will be seen that a sufficient condition that a polynomial system in R 
be diophantine is that the fundamental theorem of arithmetic shall hold in 
VR. A generalization of (3.4) including arbitrary constant coefficients is 
noted in §7. 

4. Unique decomposition is understood here in the strict sense, as in 
rational arithmetic or in the theory of ideals in an algebraic number field. 
For precision the postulates are stated. The notation in this section is inde- 
pendent of that in the rest of the paper. 

Let $\Omega$ denote a set of at least two distinct elements $a, b, \cdots$, for which the postulates (4.1)–(4.7) hold.

(4.1) Equality is significant in $\Omega$; $a=b$ or $a\neq b$; equality is symmetric, reflexive, and transitive.

(4.2) There exists a binary operation which can be applied to any pair $a$, $b$ of elements of $\Omega$, in this order, to produce a unique element, denoted by $ab$, in $\Omega$.

(4.3) $ab=ba$; $a(bc)=(ab)c$, for all $a$, $b$, $c$ in $\Omega$.

(4.4) If $\Omega$ contains $z$ such that $zx=z$ for all $x$ in $\Omega$, $z$ is unique.

(4.5) If $\Omega$ contains $u$ such that $ux=x$ for all $x$ in $\Omega$, $u$ is unique, and $u\neq z$.

(4.6) If $\Omega$ contains the $z$ in (4.4), and $ab=z$, then $a=z$ or $b=z$ (or both).

(4.7) If $ax=bx$, $x\neq z$ (if $\Omega$ contains $z$), then $a=b$; if $\Omega$ does not contain $z$, then $ax=bx$ implies $a=b$.

We need not discuss the independence of (4.1)—(4.7). The consistency is 
obvious from numerous instances. Note that only one binary operation is 
postulated. 

If $p$, $q$, $r$ are any elements $\neq z$ of $\Omega$ such that $p=qr$, we say that $r$ divides $p$, and write $r|p$ (hence also $q|p$). In all questions of divisibility $z$ is henceforth excluded.

If $u$ as in (4.5) exists, and $\epsilon|u$, $\epsilon$ in $\Omega$, $\epsilon$ is a unit. Hence $u=\epsilon\epsilon'$, and $\epsilon'$ is a unit; $\epsilon$, $\epsilon'$ are conjugate units. Obviously $u$ is a unit. If $\epsilon_1,\cdots,\epsilon_r$ are units, and $\epsilon_1',\cdots,\epsilon_r'$ their respective conjugates, $\epsilon_1\cdots\epsilon_r$ and $\epsilon_1'\cdots\epsilon_r'$ are conjugate units. From (4.5), (4.7), a unit has a unique conjugate. If $x|a$ and $x|b$ imply that $x$ is a unit, $a$, $b$ are coprime. If $a|b$ and $b|a$, $a$, $b$ are associates, $a\sim b$. From $a\sim b$ follows $a=\epsilon b$, $\epsilon$ a unit.

An element $h$ in $\Omega$ other than a unit such that $x|h$ only when $x\sim h$ or $x$ is a unit, is irreducible. An irreducible element $p$ is prime if $p|ab$ implies at least one of $p|a$, $p|b$. (This amounts to making all irreducibles primes, which is not the case in general in an algebraic integer ring.)
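
The parenthetical remark may be illustrated by a standard instance that is not taken from the text: in the ring of numbers $a+b\sqrt{-5}$ with rational integers $a$, $b$, the element 2 is irreducible but not prime. The following is a minimal sketch in Python; the pair representation `(a, b)` and the function names are assumptions made for the illustration.

```python
def norm(a, b):
    """Norm of a + b*sqrt(-5); the norm of a product is the product of the norms."""
    return a * a + 5 * b * b


def mul(x, y):
    """Product of x = (a, b) and y = (c, d), each standing for a + b*sqrt(-5)."""
    (a, b), (c, d) = x, y
    return (a * c - 5 * b * d, a * d + b * c)


def divisible_by_two(x):
    """2 divides a + b*sqrt(-5) exactly when both a and b are even."""
    return x[0] % 2 == 0 and x[1] % 2 == 0


p, q = (1, 1), (1, -1)   # 1 + sqrt(-5) and 1 - sqrt(-5)
pq = mul(p, q)           # equals (6, 0), i.e. the rational integer 6

# A proper factor of 2 would have norm 2, and a^2 + 5 b^2 = 2 has no solution;
# the small search range suffices because a^2 <= 2 and 5 b^2 <= 2 are required.
two_is_irreducible = all(
    norm(a, b) != 2 for a in range(-2, 3) for b in range(-1, 2)
)

# 2 divides the product p*q but divides neither factor, so 2 is not prime.
two_is_not_prime = (
    divisible_by_two(pq)
    and not divisible_by_two(p)
    and not divisible_by_two(q)
)

print(pq, two_is_irreducible, two_is_not_prime)
```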




If d|a and d|b imply d|g, g|a, g|b, g is the G.C.D (by definition) of a, b. 

We define $\Omega$ to be an arithmetic with respect to the binary operation in (4.2), and write $A\Omega$, if the postulates (4.8), (4.9) hold.

(4.8) If $b$ is any element of $\Omega$, there exist only a finite number of elements $x_i$ of $\Omega$ different from units such that $x_i|b$.

(4.9) Apart from permutations of $p_1,\cdots,p_r$, every element $b$ of $\Omega$ is uniquely expressible in the form $b=\epsilon p_1\cdots p_r$, where $\epsilon$ is a unit and $p_1,\cdots,p_r$ are primes.

Rational arithmetic and the theory of algebraic numbers and ideals provide several instances of $A\Omega$. We have not attempted to state a minimum set of postulates sufficient for unique factorization, as we are concerned here only with the application to be made presently of the fundamental theorem to diophantine analysis. In particular, (4.9) is a consequence of the rest, which can be weakened. An exhaustive study of postulate systems leading to (4.9) has been made by Professor M. Ward in an unpublished paper.

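The following consequence of the postulates will be required. If $a|bc$, and $a$, $b$ are coprime, then $a|c$. For, the hypotheses are equivalent to $ad=bc$, with $a$, $b$ coprime. Let $p|a$, where $p$ is prime. Then $p|b$ or $p|c$. But $p|b$ is impossible. Hence $p|c$; removing the factor $p$ from $a$ and $c$ by (4.7) and repeating the argument for the remaining prime factors of $a$ gives $a|c$.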

5. We return to polynomial systems as defined in §3. A polynomial system 
will be characterized as diophantine if all integer values of the independent 
variables satisfying the system can be given explicitly by expressing the 
variables as polynomials with integer coefficients in a finite number of inde- 
pendent parameters ranging independently over all integers. 

It will now be shown that any instance, say AVR, of VR which is an 
arithmetic in the sense of §4 with respect to vector multiplication as in §2 
provides an infinity of polynomial diophantine systems. 

The system (3.3) is purely multiplicative. Hence, since we are now operat- 
ing in AVR, the method of reciprocal arrays developed in a previous paper* 
can be applied to obtain the complete solution of (3.3) in parametric form. 
The solution expresses each of the independent variables (elements of A VR) 
as power products of parameters in AVR. The method of arrays is applicable 
because it refers to any arithmetic as defined in §4. For clearness we illustrate 
the process by giving the first step in the proof, from which (as in the paper 
cited) the rest follows by mathematical induction, in the form adapted to the 
present discussion. 


* American Journal of Mathematics, vol. 55 (1933), pp. 50-66. 




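For simplicity, let Greek letters denote elements of AVR for the moment. We shall find all $\alpha$, $\beta$, $\gamma$, $\delta$ such that

$$\alpha\beta=\gamma\delta.$$

Denote the G.C.D. of $\alpha$, $\gamma$ by $\sigma$. Then $\alpha=\sigma\alpha_1$, $\gamma=\sigma\gamma_1$, where $\alpha_1$, $\gamma_1$ are coprime. Hence $\alpha_1\beta=\gamma_1\delta$, and therefore (by the definition of divisibility and the last of §4) $\alpha_1|\delta$ and $\gamma_1|\beta$. Thus $\delta=\alpha_1\delta_1$, $\beta=\gamma_1\beta_1$. With the given equation this yields $\beta_1=\delta_1$. Denote the common value of $\beta_1$, $\delta_1$ by $\tau$. Then the complete solution is

$$\alpha=\alpha_1\sigma,\qquad \beta=\gamma_1\tau,\qquad \gamma=\gamma_1\sigma,\qquad \delta=\alpha_1\tau.$$

Moreover, it is sufficient to choose only such values of the parameters as are coprime.*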
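
The first step just described can be tried out in the simplest arithmetic, the positive rational integers. The following is a minimal sketch in Python under that assumption; the function names are illustrative, and the exhaustive check covers only a small range.

```python
from math import gcd


def parametrize(alpha, beta, gamma, delta):
    """Recover (alpha1, gamma1, sigma, tau) from alpha*beta == gamma*delta."""
    if alpha * beta != gamma * delta:
        raise ValueError("not a solution of alpha*beta = gamma*delta")
    sigma = gcd(alpha, gamma)                  # G.C.D. of alpha and gamma
    alpha1, gamma1 = alpha // sigma, gamma // sigma
    tau = delta // alpha1                      # alpha1 divides delta
    return alpha1, gamma1, sigma, tau


def rebuild(alpha1, gamma1, sigma, tau):
    """The parametric form of the solution given in the text."""
    return alpha1 * sigma, gamma1 * tau, gamma1 * sigma, alpha1 * tau


def check_all(limit):
    """Check every solution with 1 <= alpha, beta, gamma <= limit."""
    checked = 0
    for alpha in range(1, limit + 1):
        for beta in range(1, limit + 1):
            for gamma in range(1, limit + 1):
                delta, remainder = divmod(alpha * beta, gamma)
                if remainder:
                    continue                   # no integer delta for this triple
                params = parametrize(alpha, beta, gamma, delta)
                assert gcd(params[0], params[1]) == 1
                assert rebuild(*params) == (alpha, beta, gamma, delta)
                checked += 1
    return checked


print(parametrize(6, 10, 4, 15))   # (3, 2, 2, 5)
print(check_all(12))
```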

Consider now (3.3). The independent variables, say $\xi,\eta,\cdots,\zeta$, are in AVR. Let the $\rho,\sigma,\cdots,\tau$ be parameters ranging independently over all the elements of AVR. The method of reciprocal arrays gives the complete solution of (3.3) in the form

1 a, by 


1 


ky ly 
where the exponents $a, b, \cdots, m$ are constant rational integers $\geq 0$, determined by the particular forms of the power products in (3.3), and $\theta^0$ denotes $u$ for all $\theta$ in AVR. Since the variables $\xi,\eta,\cdots,\zeta$ are independent in AVR, and (say) $\xi=(\xi_1,\cdots,\xi_n)$, $\eta=(\eta_1,\cdots,\eta_n)$, $\cdots$, $\zeta=(\zeta_1,\cdots,\zeta_n)$, the variables $\xi_j,\eta_j,\cdots,\zeta_j$ ($j=1,\cdots,n$) are independent in $R$. But these are precisely the independent variables in (3.4). Since each of (3.3), (3.4) implies the other, the complete solution of (3.3) yields all sets of integers (elements of $R$) satisfying (3.4) when $\xi_j,\eta_j,\cdots,\zeta_j$ are equated respectively to the $j$th coordinates in the above power products giving the general solution $\xi,\eta,\cdots,\zeta$ of (3.3). The $j$th coordinates in question are written down as explained in §2, and are polynomials, with rational integer coefficients, in the coordinates of the $\rho,\sigma,\cdots,\tau$. But these coordinates are parameters in $R$.


* In this simple example, all solutions $(\alpha,\beta,\gamma,\delta)=(\alpha_1\sigma,\gamma_1\tau,\gamma_1\sigma,\alpha_1\tau)$ are run through once only as coprime $\alpha_1$, $\gamma_1$ and arbitrary $\sigma$, $\tau$ run through the elements of AVR. But in more complicated equations, the same solution may be given more than once. This however does not affect the statement that all solutions are given.




6. It remains to be shown that the theory is not vacuously true. For this 
it is sufficient to produce instances of AVR. 

Let $R(\omega)=R(\omega_1,\cdots,\omega_n)$ be an algebraic extension of $R$, and let $\omega_1,\cdots,\omega_n$ be a basis of $R(\omega)$ with the multiplication table

$$\omega_r\omega_s=\sum_{j=1}^{n}\phi(r,s,j)\,\omega_j.$$

If now $\omega_1=u$, we may write

$$x(i)=\sum_{j=1}^{n}x(i,j)\,\omega_j.$$

If in particular $R(\omega)$ is the ring of all algebraic integers in an algebraic number field (relative to the rational field), $\omega_1=1$, and the $x(i)$ run through all integers of the field. Hence, if the field is such that unique factorization (without the introduction of ideals) holds, it is an instance of AVR. It can be shown conversely that any AVR is isomorphic with an algebraic integer ring.
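
To see how a multiplication table yields a base $\phi$, one may take $n=2$ with basis $\omega_1=1$, $\omega_2=\omega$, where $\omega^2=B\omega_1+D\omega_2$ for rational integers $B$, $D$. The following minimal Python sketch builds the corresponding $\phi$ and tests the three postulates of §2, reading (2.1) as the identity condition, (2.2) as commutativity and (2.3) as the usual associativity condition on structure constants. The function names and the sample values of $B$, $D$ are assumptions made for the illustration.

```python
from itertools import product


def quadratic_phi(B, D):
    """Structure constants for the basis 1, w with w*w = B + D*w."""
    phi = {key: 0 for key in product((1, 2), repeat=3)}
    phi[(1, 1, 1)] = 1                     # 1 * 1 = 1
    phi[(1, 2, 2)] = phi[(2, 1, 2)] = 1    # 1 * w = w * 1 = w
    phi[(2, 2, 1)] = B                     # w * w = B + D * w
    phi[(2, 2, 2)] = D
    return phi


def satisfies_postulates(phi, n=2):
    """Test the identity, commutativity and associativity conditions."""
    idx = range(1, n + 1)
    identity = all(
        phi[(1, s, k)] == (1 if s == k else 0) for s in idx for k in idx
    )
    commutative = all(
        phi[(i, j, k)] == phi[(j, i, k)] for i in idx for j in idx for k in idx
    )
    associative = all(
        sum(phi[(r, s, j)] * phi[(j, t, k)] for j in idx)
        == sum(phi[(s, t, j)] * phi[(r, j, k)] for j in idx)
        for r in idx for s in idx for t in idx for k in idx
    )
    return identity and commutative and associative


print(satisfies_postulates(quadratic_phi(-1, 0)))    # Gaussian integers, w = i
print(satisfies_postulates(quadratic_phi(-5, -4)))   # another sample table
```

Any integers $B$, $D$ pass this test, since the table is that of the ring of polynomials in $\omega$ reduced by $\omega^2=B+D\omega$. Whether the resulting VR is an arithmetic in the sense of §4 is a separate question, which the sketch does not address.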

If in the algebraic integer ring R(w) there is not unique factorization, we 
replace the integers by the principal ideals which they generate; §4 is then 
applicable. But the application to (3.3) as in §5 does not then yield the solu- 
tion of (3.4) practically, although it does theoretically, on account of the 
following elementary difficulty in the theory of ideals: Given the bases of 
two general ideals A, B to exhibit the basis of their product in terms of the 2n 
integers defining the bases of A, B. The use of canonical two-term bases does 
not remove the difficulty. If general in the preceding be replaced by specific, 
so that the bases of A, B are expressed in terms of given integers, the problem, 
so far as it concerns algebraic numbers, is solvable in a finite number of steps. 
But in that case, there is no diophantine problem (3.3) or (3.4). The general 
existence proof concerning a basis of R(w) seems to lead to nothing usable 
for diophantine analysis. 

7. The system (3.3) and its equivalent (3.4) are more restricted than is 
necessary. Each power product in (3.3) may be replaced by an arbitrary con- 
stant integer multiple of itself. For the discussion of (3.3) in this more general 
case we refer to an article in the Bulletin of the American Mathematical 
Society for 1933. By referring to §3 it is easily seen what the corresponding 
(3.4) has become: arbitrary integer coefficients are introduced. 

In the papers cited, several illustrative examples of (3.3) have been given, 
and any desired number can be written out. In conjunction with any alge- 
braic number field in which there is unique factorization, any such example 




gives a polynomial diophantine system and its complete solution. An example 
is given presently. 

In the second paper cited, systems not in the form (3.3) but reducible to 
that form by linear homogeneous substitutions on the variables, with in- 
teger coefficients, were discussed. For example, x?+y?=w?, x?+y?=w?+?, 
and 


(xtyt wi +P + = + y+ 23. 


Such transformed (3.3) are completely solvable, and hence also the corre- 
sponding transformed (3.4). 

The notation developed in §3 for the jth coordinate in any power product 
in VR enables us to state in concise form the system (3.4) equivalent to a 
given (3.3) and to write down the complete solution of a specific system (3.4) 
from the complete solution of the equivalent (3.3). The last is obtained directly by the algorithm of reciprocal arrays. If the explicit polynomial expressions of the coordinates are required, they are given (for a fixed base $\phi$) by the first formula in §3. A simple example, where (3.3) consists of only one equation, will suffice.

The complete solution in $A\Omega$ of the equation


(7.1) $x^3=ytw$

is found by the method of arrays to be

(7.2)
$x=mabcfghpqr$,
$y=mgh(af)^2p^3$,
$t=mca(bg)^2q^3$,
$w=mbf(ch)^2r^3$,


where $m, a, b, c, f, g, h, p, q, r$ are parameters ranging independently over all elements of $A\Omega$. We shall omit the G.C.D. conditions which may be imposed if desired, as they do not affect the generality of the solution.
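
That the parametric forms (7.2) satisfy (7.1) identically can be checked numerically. The following is a minimal sketch in Python over the rational integers, with illustrative function names. It tests only that every choice of parameters gives a solution, not that every solution is obtained.

```python
import random


def solution(m, a, b, c, f, g, h, p, q, r):
    """The parametric forms (7.2) for x, y, t, w."""
    x = m * a * b * c * f * g * h * p * q * r
    y = m * g * h * (a * f) ** 2 * p ** 3
    t = m * c * a * (b * g) ** 2 * q ** 3
    w = m * b * f * (c * h) ** 2 * r ** 3
    return x, y, t, w


def satisfies_equation(x, y, t, w):
    """Equation (7.1): the cube of x equals the product y*t*w."""
    return x ** 3 == y * t * w


def random_check(trials, seed=0):
    """Try the identity on random integer parameters (negatives included)."""
    rng = random.Random(seed)
    return all(
        satisfies_equation(*solution(*[rng.randint(-9, 9) for _ in range(10)]))
        for _ in range(trials)
    )


print(solution(1, 1, 1, 1, 1, 1, 1, 2, 1, 1))   # (2, 8, 1, 1)
print(random_check(1000))
```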

By a mere change of notation (7.1) becomes (7.3) in AVR,


(7.3) $v^3(x)=v(y)v(t)v(w)$,


which is equivalent in $R$ to the simultaneous system (corresponding to (3.4)),

(7.4) $v(x^3;j)=v(y,t,w;j)$ ($j=1,\cdots,n$).


The $j$th coordinates written in (7.4) are homogeneous polynomials in $R$ of degree 3, whose explicit forms can be written down by the first formula in §3. The complete solution of (7.4) is written down similarly from (7.2):


$v(x;j)=v(m,a,b,c,f,g,h,p,q,r;j)$,
$v(y;j)=v(m,g,h,a^2,f^2,p^3;j)$,
$v(t;j)=v(m,c,a,b^2,g^2,q^3;j)$,
$v(w;j)=v(m,b,f,c^2,h^2,r^3;j)$ ($j=1,\cdots,n$).


Thus the 4 independent variables in (7.4) are given parametrically in the complete solution in terms of 10 integer parameters.

For the complete solution of a given system (3.4) equivalent to (3.3) in an AVR which is algebraic of degree $n$ it is necessary to select algebraic number fields of degree $n$ in which there is unique factorization, and to construct the multiplication table $\omega_r\omega_s$ ($r,s=1,\cdots,n$) for the basis, in order to get the $\phi(r,s,j)$ ($j=1,\cdots,n$). For $n=2, 3, 4$ only is there sufficient knowledge extant to enable us to obtain the general $\phi$. For no $n$ is it known in all of what fields of that degree there is unique factorization; if $n=2$ any field with class number 1 may be used, but not all such fields are known; if $n=3$ there are numerous special fields known. For $n\geq 4$, the $\phi$ can also be obtained for some special fields. Although there is nothing approaching generality in the available data concerning algebraic fields which is necessary for the application to diophantine analysis, nevertheless an infinity of completely solvable polynomial diophantine systems exist, and any number can be constructed from a single algebraic AVR alone.

As the entire subject originated in the Pythagorean equation $x^2+y^2=z^2$, we shall state the most general system equivalent to this and solvable completely in rational integers by the methods of this paper. Let $d$ denote a nonzero rational integer, and for simplicity restrict $d$ to have no square factor $>1$ (a restriction easily removed). Write $D=4d$ if $d\equiv 2$ or $3 \bmod 4$, $D=d$ if $d\equiv 1 \bmod 4$; $B=-\tfrac{1}{4}D(D-1)$. Then $B$ is a rational integer. Let the field generated by $d^{1/2}$ have class number 1. Then the system in question is

$$x_1^2+y_1^2-z_1^2-w_1^2+B(x_2^2+y_2^2-z_2^2-w_2^2)=0,$$
$$2(x_1x_2+y_1y_2-z_1z_2-w_1w_2)+D(x_2^2+y_2^2-z_2^2-w_2^2)=0.$$

As the complete solution of this in rational integers $x_i$, $y_i$, $z_i$, $w_i$ ($i=1,2$) is somewhat more detailed than that of the next, which is equivalent to it, we shall conclude with the complete solution in rational integers $\xi_i$, $\eta_i$, $\lambda_i$, $\mu_i$ ($i=1,2$) of the system

$$\xi_1\xi_2+B\eta_1\eta_2=\lambda_1\lambda_2+B\mu_1\mu_2,$$
$$\xi_1\eta_2+\xi_2\eta_1+D\eta_1\eta_2=\lambda_1\mu_2+\lambda_2\mu_1+D\mu_1\mu_2.$$

Let the $\alpha_j$, $\beta_j$ ($j=1,\cdots,4$) be parameters ranging over all rational integers independently. Then the complete solution is

$$\xi_1=\alpha_1\alpha_3+B\beta_1\beta_3,\qquad \eta_1=\alpha_1\beta_3+\alpha_3\beta_1+D\beta_1\beta_3,$$
$$\xi_2=\alpha_2\alpha_4+B\beta_2\beta_4,\qquad \eta_2=\alpha_2\beta_4+\alpha_4\beta_2+D\beta_2\beta_4,$$
$$\lambda_1=\alpha_1\alpha_4+B\beta_1\beta_4,\qquad \mu_1=\alpha_1\beta_4+\alpha_4\beta_1+D\beta_1\beta_4,$$
$$\lambda_2=\alpha_2\alpha_3+B\beta_2\beta_3,\qquad \mu_2=\alpha_2\beta_3+\alpha_3\beta_2+D\beta_2\beta_3.$$

This follows at once, by the algorithm described, from the solution of $\alpha\beta=\gamma\delta$ in §5. Note that nothing has been proved if the class number exceeds unity.
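
The structure of this last solution is that of $\alpha\beta=\gamma\delta$ with each vector written as a pair of coordinates to the basis $1$, $\omega$, where $\omega^2=D\omega+B$. The following is a minimal sketch in Python under that reading; the function names and the pairing of parameters are assumptions of the sketch. It tests only that the parametric forms satisfy the system, since completeness depends on the class number hypothesis and is not tested.

```python
import random


def field_constants(d):
    """D and B for a square-free rational integer d other than 0 and 1."""
    D = 4 * d if d % 4 in (2, 3) else d
    B = -(D * (D - 1)) // 4          # exact: D*(D - 1) is divisible by 4
    return D, B


def mul(x, y, D, B):
    """Coordinates of (x1 + x2*w)(y1 + y2*w), where w*w = D*w + B."""
    return (x[0] * y[0] + B * x[1] * y[1],
            x[0] * y[1] + x[1] * y[0] + D * x[1] * y[1])


def parametric_solution(p1, p2, p3, p4, D, B):
    """Pairs (xi_1, eta_1), (xi_2, eta_2), (lambda_1, mu_1), (lambda_2, mu_2),
    where p_j = (alpha_j, beta_j)."""
    return (mul(p1, p3, D, B), mul(p2, p4, D, B),
            mul(p1, p4, D, B), mul(p2, p3, D, B))


def satisfies_system(v1, v2, l1, l2, D, B):
    """Both equations of the system in xi, eta, lambda, mu."""
    (xi1, eta1), (xi2, eta2), (lam1, mu1), (lam2, mu2) = v1, v2, l1, l2
    first = xi1 * xi2 + B * eta1 * eta2 == lam1 * lam2 + B * mu1 * mu2
    second = (xi1 * eta2 + xi2 * eta1 + D * eta1 * eta2
              == lam1 * mu2 + lam2 * mu1 + D * mu1 * mu2)
    return first and second


def random_check(d, trials, seed=0):
    """Test the parametric forms on random integer parameters."""
    rng = random.Random(seed)
    D, B = field_constants(d)
    for _ in range(trials):
        params = [(rng.randint(-9, 9), rng.randint(-9, 9)) for _ in range(4)]
        if not satisfies_system(*parametric_solution(*params, D, B), D, B):
            return False
    return True


print(field_constants(-1))      # (-4, -5)
print(random_check(-1, 500), random_check(2, 500), random_check(5, 500))
```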

Finally it may be stated that the number of parameters appearing in any 
solution obtained by the algorithm is both necessary and sufficient for the 
complete solution. This is a consequence of the like for any application of 
reciprocal arrays. 


CALIFORNIA INSTITUTE OF TECHNOLOGY, 
PASADENA, CALIF. 




SECTIONS OF POINT SETS* 


BY 
DEANE MONTGOMERY 


1. INTRODUCTION 


A section of a plane point set $E$ is defined as that subset of $E$ which contains all points of $E$ lying on a line $L$. If $L$ is a horizontal line the section is called a horizontal section and if $L$ is a vertical line, the section is called a vertical section. It is the purpose of this paper to study the relations between $E$ and its horizontal and vertical sections. Kuratowski and Ulam†, Sierpinski‡, and Fubini§ have considered various phases of this problem. Baire∥, Hahn¶, Kempisty** and others have considered the closely related problem of finding the relations between a function $f(x,y)$ and the functions obtained by holding $x$ or $y$ constant.

In order to state results in a general manner, $E$ will be regarded as a subset of a combinatorial product space $A\times B$ where $A$ and $B$ are metric spaces and $B$ is separable. Such a space is defined as the collection of all pairs of points $(x,y)$, $x$ being a point of $A$ and $y$ being a point of $B$. The distance between $(x_1,y_1)$ and $(x_2,y_2)$ is here defined to be … The plane is a special case of such a space in which $A$ and $B$ are straight lines, and all the results of this paper apply to the plane and also to an $(m+n)$-dimensional euclidean space considered as the product of an $m$-dimensional and an $n$-dimensional euclidean space.

Because $A\times B$ is analogous to the plane, the subset of points $(x,y)$ such that $x=a$ is called a vertical section of $A\times B$ and is denoted by $a\times B$ or $(x=a)$; similarly the subset of points $(x,y)$ such that $y=b$ is called a horizontal section of $A\times B$ and is denoted by $A\times b$ or $(y=b)$. If $E$ is any subset of $A\times B$ the set $E\cdot(x=a)$ is called a vertical section of $E$ and the set $E\cdot(y=b)$ is called a horizontal section of $E$.


* Presented to the Society, November 25, 1932; received by the editors February 1, 1933. 

† Fundamenta Mathematicae, vol. 19, p. 247; see also an article by Kuratowski in vol. 17, p. 275.

‡ Fundamenta Mathematicae, vol. 1, p. 112.

§ Rendiconti della Reale Accademia dei Lincei, (5), vol. 16, I. For a statement of Fubini's theorem see also Carathéodory, Vorlesungen über Reelle Funktionen, 1927, p. 621.

∥ Annali di Matematica, 1899, p. 1.

¶ Mathematische Zeitschrift, 1919, p. 306.

** Fundamenta Mathematicae, vol. 14, p. 237, and vol. 19, p. 184.

†† If $p$ and $q$ are any two points of a metric space, $(pq)$ denotes the distance between $p$ and $q$.




If $E$ is closed, all horizontal and vertical sections of $E$ are closed, and if $E$ is open, its horizontal and vertical sections are open (relative to the sections of $A\times B$ which contain them). A similar proposition is true for sets $F$ or $O$ of type $\alpha$. Converse propositions are not true. The plane set of points $(1/n, 1/n)$, where $n$ takes all integral values, is such that each of its horizontal and vertical sections contains at most one point and is therefore closed. The point $(0,0)$ is a limit point of the set which is not in the set. Sierpinski* has constructed a plane set every section of which (not merely horizontal and vertical sections) contains at most two points and which is non-measurable in the Lebesgue sense. This example shows that the fact that every horizontal and vertical section of $E$ is of type $\alpha$ is not a sufficient condition that $E$ be of type $\alpha$, and that in order to obtain such a sufficient condition, further restrictions on the sections or on the relations between them must be imposed. By restricting the vertical sections to a type of set called I-set (or the complement of such a set) sufficient conditions may be obtained that a set be of various types. This is done in §3. Necessary and sufficient conditions that sets with restricted vertical sections be of class $\alpha$ are given in §6. Uses of sets called gratings are considered in §7. Theorems are given which show that boundaries of sets with certain kinds of sections lie on sets of lines of the first category. The results are applied in §8 to prove a theorem of Baire concerning functions of two variables continuous in each of them and to obtain a result regarding Kempisty's generalization of this theorem.
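
The first counterexample above can be examined on a finite portion of the set. The following is a minimal sketch in Python, restricted to positive $n$ and using exact fractions; the function names are illustrative assumptions. It shows that every horizontal or vertical section of the sampled set holds at most one point, while the points come arbitrarily near $(0,0)$, which is not in the set.

```python
from collections import Counter
from fractions import Fraction


def diagonal_points(count):
    """The points (1/n, 1/n) for n = 1, ..., count."""
    return [(Fraction(1, n), Fraction(1, n)) for n in range(1, count + 1)]


def largest_section(points):
    """Greatest number of points on any horizontal or vertical line."""
    vertical = Counter(x for x, _ in points)      # sections (x = a)
    horizontal = Counter(y for _, y in points)    # sections (y = b)
    return max(max(vertical.values()), max(horizontal.values()))


def nearest_to_origin(points):
    """Smallest value of max(|x|, |y|) over the points."""
    return min(max(abs(x), abs(y)) for x, y in points)


points = diagonal_points(1000)
print(largest_section(points))                      # 1
print(nearest_to_origin(points))                    # 1/1000
print((Fraction(0), Fraction(0)) in points)         # False
```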


2. HORIZONTAL SECTIONS OF CLASS M 


The following definitions will be useful. 


DEFINITION 1. If the inner points of a set are dense on the set, the set is 
called an I-set. 


A set may have this property with respect to $A \times B$ or it may be a subset
of a section of $A \times B$ and have this property with respect to the section,
this latter being the case which will most often arise.


DEFINITION 2. Given a point (a, b), the set of points (a, y) where $(by) < r$, r
a positive number, is called an open vertical interval of center (a, b) and radius r.


A closed vertical interval is defined in the same way except that $(by) \le r$.
Closed and open horizontal intervals may also be defined. A vertical interval
might also be defined as $a \times S$ where S is a sphere in B of center b and
radius r.†


* See the previous reference. 


{ 


1933] SECTIONS OF POINT SETS 917 


DEFINITION 3. If a set G lies on a horizontal section, G(r) is the set of points 
of open vertical intervals of radii r and centers at the points of G. 


DEFINITION 4. If $\mathcal{M}$ is a family* of point sets M lying on horizontal sections
of $A \times B$, $\mathcal{M}(r)$ is the family of all point sets M(r) for r ranging over all positive
numbers.


If $\mathcal{M}$ is a family of point sets, $\mathcal{M}_\sigma$ ($\mathcal{M}_\delta$) denote, as is conventional, the
families composed of all possible sums (products) of an enumerable number
of sets of $\mathcal{M}$.

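Definitions 3 and 4 and the families of sums and products may be restated in set-builder form. This is a sketch in modern notation: $\mathcal{M}$ stands for the family of Definition 4, and union and intersection signs stand for what the text calls sums and products.

```latex
\begin{align*}
G(r) &= \{(a,y) : (a,b)\in G,\ (by)<r\}
        \qquad \text{for } G \text{ lying on the section } (y=b),\\
\mathcal{M}(r) &= \{\,M(r) : M\in\mathcal{M},\ r>0\,\},\\
\mathcal{M}_\sigma &= \Bigl\{\textstyle\bigcup_{n=1}^{\infty} M_n : M_n\in\mathcal{M}\Bigr\},
\qquad
\mathcal{M}_\delta = \Bigl\{\textstyle\bigcap_{n=1}^{\infty} M_n : M_n\in\mathcal{M}\Bigr\}.
\end{align*}
```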
A set of vertical sections K is said to be everywhere dense if the set of 
points [a], such that (x=a) is in K, is everywhere dense in A. In a similar 
manner other point-set properties, for example the property of being in the 
first or the second category, are ascribed to sets of vertical sections and to 
sets of horizontal sections as well. 

By the projection of a point (x, y) on (y = b) is meant the point (x, b).
The projection of a set of points E on (y = b) is the set of points formed by
projecting all the points of E on (y = b).


THEOREM 1. If each vertical section of E is an open set whose complement is
an I-set† and horizontal sections of E belong to $\mathcal{M}$, then E belongs to the family
$[\mathcal{M}_\delta(r)]_\sigma$.


Since B is separable there exists an enumerable everywhere dense set
$(y = b_i)$ of horizontal sections of $A \times B$. Let $r_i$ be a sequence of
positive numbers approaching 0 as a limit. Let $K_j$ be the projection of
$E \cdot (y = b_j)$ on $(y = b_i)$. Let $A_{im}$ be the product of
$E \cdot (y = b_i)$ and all sets $K_j$ such that $(b_i b_j) < r_m$. The set
$A_{im}(r_m)$ is a subset of E. To prove this, suppose that $A_{im}(r_m)$
contains a point (a, b) of CE. The point $(a, b_i)$ is in $A_{im}(r_m)$ and
also all points of the open vertical interval of radius $r_m$ and center
$(a, b_i)$ are in $A_{im}(r_m)$. Since (a, b) lies in this open vertical
interval, there is by hypothesis some inner point (a, c) of CE in this
interval. There is then an e such that if $(cy) < e$, (a, y) is in CE. Since
the $b_i$'s are everywhere dense in B there is some $b_j$, say $b_n$, such that
$(cb_n) < e$. The point $(a, b_n)$ is then in CE. This $b_n$ may be so chosen
that $(b_i b_n) < r_m$. The set $E \cdot (y = b_n)$ does not contain
$(a, b_n)$ and $K_n$ will not contain $(a, b_i)$. Therefore $(a, b_i)$ will not
be in $A_{im}(r_m)$. From this contradiction, it follows that $A_{im}(r_m)$ is
in E.


* It is here supposed that if E consisting of points (x, b) is in $\mathcal{M}$, the set of points (x, c), where x
has the same range as in E and c is any point of B, is also in $\mathcal{M}$. This restriction is made for
convenience. It is necessary for the proofs of some of the following theorems in which use is made of the
projections of sets from one horizontal section to another.

† A more explicit statement is as follows: If V is any vertical section of $A \times B$, $V \cdot E$ is open in V
and $V - E$ is an I-set in V, etc. Language similar to that in the hypothesis of Theorem 1 will be used
throughout, with the meaning given in this note.

‡ With respect to the section (x = a).

To prove that every point of E is in some $A_{im}(r_m)$, let (c, d) be any point
of E. There is an $\epsilon$ such that if $(dy) < \epsilon$, (c, y) is in E.
Choose $r_m$ and $b_i$ such that $(b_i d) < r_m < \epsilon/2$. Every point
(c, y) of the open vertical interval of center $(c, b_i)$ and radius $r_m$ is
in E. For since $(b_i y) < \epsilon/2$ and $(b_i d) < \epsilon/2$, it follows
from the triangle axiom that $(dy) < \epsilon$. Therefore the set $K_j$
contains $(c, b_i)$ if $(b_i b_j) < r_m$, and $A_{im}(r_m)$ contains all points
(c, y) such that $(b_i y) < r_m$. It must then contain (c, d) since
$(b_i d) < r_m$.

Thus $E = \sum_{i,m} A_{im}(r_m)$, which proves the theorem.

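The construction in the proof of Theorem 1 can be summarized as follows. This is a sketch in modern notation; the superscript in $K^{(i)}_j$, which records the section onto which the projection is taken, is added here for clarity and is not in the original.

```latex
% K^{(i)}_j: projection of E \cap (y=b_j) onto (y=b_i); the text writes K_j.
\begin{align*}
A_{im} &= \bigl(E\cap(y=b_i)\bigr)\;\cap
          \bigcap_{j:\,(b_ib_j)<r_m} K^{(i)}_j \;\in\; \mathcal{M}_\delta,\\
A_{im}(r_m) &= \{(a,y) : (a,b_i)\in A_{im},\ (b_iy)<r_m\}
          \;\in\; \mathcal{M}_\delta(r),\\
E &= \bigcup_{i,m} A_{im}(r_m) \;\in\; [\mathcal{M}_\delta(r)]_\sigma .
\end{align*}
```

The membership of $A_{im}$ in $\mathcal{M}_\delta$ uses the convention of the footnote to Definition 4, namely that projections of sets of the family onto other horizontal sections remain in the family.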

THEOREM 2. If vertical sections of E are closed I-sets and horizontal sections
of E belong to the family $\mathcal{M}$, then E belongs to the family $[\mathcal{M}_\sigma(r)]_{\sigma\delta}$.


Let $(y = b_i)$ be an enumerable everywhere dense set of horizontal sections
of $A \times B$ and let $(r_i)$ be a sequence of positive numbers approaching 0.
Let $K_j$ be the projection of $E \cdot (y = b_j)$ on $(y = b_i)$. Let $A_{im}$
be the sum of $E \cdot (y = b_i)$ and all sets $K_j$ such that
$(b_i b_j) < r_m$. The set $E_m = \sum_i A_{im}(r_m)$ is a member of the family
$[\mathcal{M}_\sigma(r)]_\sigma$. The set $E_m$ contains E, for all m. To prove
this let (c, d) be any point of E. By hypothesis there is for each $r_m$ a
vertical interval $V_e$, of radius e, which contains only points of E and for
every point (c, y) of which $(dy) < r_m$. There is some $(y = b_i)$ which cuts
$V_e$ in a point $(c, b_i)$. The point $(c, b_i)$ is then in E. The set
$A_{im}(r_m)$ contains (c, d) because $(db_i) < r_m$.

The set E is therefore included in $\prod_m E_m$. To prove that
$E = \prod_m E_m$, let (a, b) be any point of CE. There is an $\epsilon$ such
that if $(by) < \epsilon$, (a, y) is in CE. In order that (a, b) be in
$A_{im}(r_m)$ it is necessary that $(bb_i) < r_m$. Choose $r_m < \epsilon/2$.
If $(bb_i) < r_m$ and $(b_i y) < r_m$, it follows that $(by) < \epsilon$, and
consequently (a, y) is in CE. Therefore if $(bb_i) < r_m$, $A_{im}(r_m)$ cannot
contain (a, b) because all points on the open vertical interval of center
$(a, b_i)$ and radius $r_m$ are in CE. As has been mentioned, $A_{im}(r_m)$
cannot contain (a, b) if $(bb_i) \ge r_m$. Therefore for $r_m$ chosen as it has
been, (a, b) is not in $E_m$ and consequently not in $\prod_m E_m$. It follows
that $E = \prod_m E_m$, which completes the proof of the theorem.

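For comparison with Theorem 1, the sets used in the proof of Theorem 2 may be displayed in the same way. Again this is a sketch; the superscript in $K^{(i)}_j$ is an added label for the section onto which the projection is taken.

```latex
% K^{(i)}_j: projection of E \cap (y=b_j) onto (y=b_i); the text writes K_j.
\begin{align*}
A_{im} &= \bigl(E\cap(y=b_i)\bigr)\;\cup
          \bigcup_{j:\,(b_ib_j)<r_m} K^{(i)}_j \;\in\; \mathcal{M}_\sigma,\\
E_m &= \bigcup_{i} A_{im}(r_m) \;\in\; [\mathcal{M}_\sigma(r)]_\sigma,\\
E &= \bigcap_{m} E_m \;\in\; [\mathcal{M}_\sigma(r)]_{\sigma\delta}.
\end{align*}
```

The only change from Theorem 1 is that the product defining $A_{im}$ is replaced by a sum, which is then compensated by the final product over m.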

3. HORIZONTAL SECTIONS OF THE TYPE $\alpha$


LEMMA 1. Let G be a set lying on a horizontal section of $A \times B$. If G is an
$O_\alpha$ ($\alpha \ge 0$) or an $F_\alpha$ ($\alpha > 0$), G(r) is an $O_\alpha$ or an $F_\alpha$.* If G is analytic, G(r) is
analytic.

The first part of the lemma is true for sets $O_0$ and sets $O_1$, and can be
shown to be true for sets $O_\alpha$ by transfinite induction. The latter part of the
lemma may be readily proved from the definition of an analytic set.


* For a discussion of sets $F_\alpha$ and $O_\alpha$, see de la Vallée Poussin, Intégrales de Lebesgue, p. 132.




THEOREM 3. If the horizontal sections of E are $O_\alpha$'s and the vertical sections
are closed I-sets, then E is an $F_{\alpha+1}$.


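This follows from Theorem 2. $\mathcal{M}$ is here the family of $O_\alpha$'s in
horizontal sections of $A \times B$. $\mathcal{M}_\sigma$ is the same family and
by the preceding lemma $\mathcal{M}_\sigma(r)$ is a family of $O_\alpha$'s in
$A \times B$, as is $[\mathcal{M}_\sigma(r)]_\sigma$. Therefore
$[\mathcal{M}_\sigma(r)]_{\sigma\delta}$ is a family of $F_{\alpha+1}$'s in
$A \times B$ and E must be an $F_{\alpha+1}$.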

By taking complements the following is proved: 


THEOREM 4. If the horizontal sections of E are $F_\alpha$'s and the vertical sections
are open sets whose complements are I-sets, then E is an $O_{\alpha+1}$.


That the change in classification mentioned in Theorems 3 and 4 actually
may occur, is shown by the following plane set. On the line y = x, take a
set E which is an $O_{\alpha+1}$ ($\alpha \ge 1$)* and an O of no lower class.†
Then $E = \sum_i A_i$ where $A_i$ is an $F_\alpha$ at most. At each point of
$A_i$ erect a vertical interval of length $1/2^i$, closed at the end touching
y = x and open at the other end, and denote the set thus obtained by H.
Horizontal sections of H are $F_\alpha$'s. This can be seen as follows. If L is
any horizontal line and p any point on this line above the line y = x, vertical
intervals from only a finite number of the sets $A_i$ can cut L to the left of,
or at, p. Therefore the points of H on L to the left of or at p must form an
$F_\alpha$. This is true however close p may be to y = x. Let $p_n$ be a
sequence of points on L above y = x, approaching y = x. Let $E_n$ be the points
of $(CH) \cdot L$ to the left of or at $p_n$. Then $E_n$ is an $O_\alpha$, and
$\sum E_n$ is an $O_\alpha$. Therefore the points of CH on L to the left of
y = x form an $O_\alpha$ and consequently the points of H on L form an
$F_\alpha$. Since $\alpha \ge 1$, it does not matter whether or not the
intersection of L and y = x is in H. Although horizontal sections of H are
$F_\alpha$'s, the set H itself must be at least an $O_{\alpha+1}$ since y = x
cuts it in an $O_{\alpha+1}$.

Denote by R that part of the complement of H which lies on or above
y = x. Horizontal sections of R are $O_\alpha$'s but R itself is an $F_{\alpha+1}$ at least, since
y = x cuts it in an $F_{\alpha+1}$. By Theorem 3, R is an $F_{\alpha+1}$ at most. This example
shows that under the hypothesis of Theorem 3, it is impossible to draw a
stronger conclusion on the F classification of E than the one there given.


THEOREM 5. If the horizontal sections of E are $O_\alpha$'s and the vertical sections
are open sets whose complements are I-sets, then E is an $O_{\alpha+2}$.


This follows from Theorem 1. The family $\mathcal{M}$ is here the family of sets
$O_\alpha$ on horizontal sections of $A \times B$. $\mathcal{M}_\delta$ is then
the family of sets $F_{\alpha+1}$ on horizontal sections.
$\mathcal{M}_\delta(r)$ contains only sets $F_{\alpha+1}$ in $A \times B$ by
Lemma 1. Therefore $[\mathcal{M}_\delta(r)]_\sigma$ contains only sets
$O_{\alpha+2}$.


* A single open vertical interval furnishes an example in case $\alpha = 0$.
† For a proof of the existence of functions of all classes (which proves the existence of sets of all
classes) see de la Vallée Poussin, Intégrales de Lebesgue, p. 145 ff.

By taking complements there is proved 

THEOREM 6. If the horizontal sections of E are $F_\alpha$'s and the vertical sections
are closed I-sets, then E is an $F_{\alpha+2}$.


Whether or not the classification may actually be increased by two under
the hypothesis of Theorems 5 and 6 is an open question. That an advance
of one may occur is shown by the following plane set. Construct the set H
as in the preceding example, except that the vertical intervals are now to be
closed instead of half closed. Horizontal sections of this set are $F_\alpha$'s as before
and the set itself must be an $O_{\alpha+1}$ exactly as H was.

The two following theorems may be proved directly or as a result of the
preceding theorems on sets $O_\alpha$ and $F_\alpha$.


THEOREM 7. If horizontal sections of E are $A_\alpha$'s* and vertical sections of E
are all closed I-sets or all open sets whose complements are I-sets, then E is
an

4. ANALYTIC OR MEASURABLE HORIZONTAL SECTIONS 


If $\mathcal{M}$ is the family of analytic sets, $\mathcal{M}_\sigma$ and $\mathcal{M}_\delta$ are the same family, from
which we have the following theorem.


THEOREM 8. If horizontal sections of E are analytic and vertical sections of
E are all closed I-sets or all open sets whose complements are I-sets, E is analytic.

If there is a theory of measure in the space under consideration, as for 
example in the plane, a theorem similar to Theorem 8 is true for measurable 
sets. 


5. THE SET $E_e$


Let $E_e$ denote the subset of E each point of which lies on a closed vertical
interval of radius exactly e, which contains only points of E.

For the two theorems of this section the conditions on A and B are that 
they are metric, that every closed sphere in B is compact and that inner 
points of a closed sphere in B are dense on the sphere. 


THEOREM 9. If horizontal and vertical sections of E are closed, $E_e$ is closed.
Let $(a_n, b_n)$ be a sequence of points of $E_e$ converging to a limit point (a, b).
Each $(a_n, b_n)$ lies on a closed vertical interval, $V_e^n$, containing only points of E,
of radius e, and of center $(a_n, c_n)$. An infinite number of points $b_n$ are such that
$(bb_n) < \epsilon$, where $\epsilon$ is any positive number. The fact that $(b_n c_n) \le e$ for all n


* For a discussion of sets $A_\alpha$, see de la Vallée Poussin, Intégrales de Lebesgue, p. 135.




implies that for an infinite number of points $c_n$, $(bc_n) < e + \epsilon$,
and by hypothesis, these points have some limit point c such that
$(bc) \le e$. Let $V_e$ be the closed vertical interval of center (a, c) and
radius e. Let (a, y) be any point such that $(cy) < e$, that is, any point on
the interior of the vertical interval $V_e$. It will now be shown that there is
an n for which $(a_n, y)$ is in E and $(a_n a) < \eta$, where $\eta$ is any
positive number. Consider the n's for which $(a_n a) < \eta$ and select from
this group an n such that $(c_n c) < e - (cy)$, that is, such that
$(c_n c) + (cy) < e$. It follows that $(c_n y) < e$ and therefore the point
$(a_n, y)$ is in E. On the horizontal section through (a, y) there is thus a
sequence of points of E approaching (a, y), and since horizontal sections of E
are closed, (a, y) must be in E. Therefore, all inner points of the vertical
interval $V_e$ are in E. Because of the hypothesis on the space B and the fact
that vertical sections of E are closed, it follows that every point of $V_e$ is
in E. Since $(bc) \le e$, the point (a, b) is in $V_e$, which proves that every
limit point of $E_e$ is in $E_e$.

For Theorem 10, B is required to have the further property that any
point p of B on a closed sphere of radius > r is on a sphere of radius exactly
r, which is contained in the first sphere.


THEOREM 10. If horizontal and vertical sections of E are closed and each
point of E lies on a closed vertical interval containing only points of E, then E
is an $F_\sigma$.


Let $(r_i)$ be a sequence of positive numbers approaching 0. Any point in
$E_r$ is in $E_{r_i}$ if $r_i \le r$. Hence $E = \sum E_{r_i}$, and since $E_{r_i}$ is closed, E is an $F_\sigma$.

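The proof of Theorem 10 rests on a single decomposition, which may be displayed as follows. This is a sketch in modern notation, with the union sign standing for the sum of the text.

```latex
\[
E=\bigcup_{i=1}^{\infty}E_{r_i},\qquad r_i\to 0,\qquad
E_{r_i}\ \text{closed by Theorem 9}
\;\Longrightarrow\; E \text{ is an } F_\sigma .
\]
```

The inclusion of E in the union uses the additional property required of B: a point lying on a closed vertical interval of radius r lies on one of radius exactly $r_i$ whenever $r_i \le r$.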

6. UNIFORMITY PROPERTIES 


DEFINITION 5. A point p is said to be a point of uniformity of E if, for
some open sphere S in $A \times B$ of center p and for some e, $E \cdot S = E_e \cdot S$.

DEFINITION 6. A point p is said to be a point of uniform separation of E if
it is a point of uniformity of CE.

The points of uniformity of E form an open set.

For the theorems of this section, A and B are metric separable spaces. In 
addition it is required that every sphere in B be totally limited* and that B 
have the property stated just before Theorem 10. 

THEOREM 11. If the vertical sections of E are open, a necessary and sufficient
condition that E be an $O_\alpha$ is that the set of points of non-uniform separation of E
in E be an $O_\alpha$ and that horizontal sections of E be $O_\alpha$'s.


The necessity will first be demonstrated. If E is an $O_\alpha$ its horizontal sections
are $O_\alpha$'s. Let N denote the set of points of non-uniform separation of E.


* See Hausdorff, Mengenlehre, 1927, p. 108. 






Since N is closed, it follows that $E \cdot N$ is an $O_\alpha$ if $\alpha$ is greater than 0. For $\alpha = 0$,
the necessity is obvious.

The sufficiency will now be demonstrated. Let p be any point of E which
is a point of uniform separation of E. For some open sphere S of center p
and some e, $S \cdot (CE)_e = S \cdot (CE)$. Let $r_i$ be a sequence of positive
numbers approaching 0 and let $(y = b_i)$ be an enumerable everywhere dense set
of horizontal sections of $A \times B$. Let $r_m$ be a fixed element of the
sequence $r_i$ and let $\epsilon$ be the smaller of the two numbers e/4 and
$r_m/4$. Let $S_1$ be a sphere with the same center as that of S and with
radius 2e larger than that of S. Because the projection of $S_1$ in B is a
sphere in B it is totally limited in B by hypothesis. There exists then a
finite set of horizontal sections $(y = h_n)$ such that every point of $S_1$ is
a distance less than $\epsilon$ from some one of them. Let $K_n$ be the
projection of $S \cdot E \cdot (y = h_n)$ on $(y = b_i)$. Denote by $A_{im}$
the product of $S \cdot E \cdot (y = b_i)$ and all $K_n$'s such that
$(h_n b_i) < r_m$.

In order to show that $A_{im}(r_m/2)$ is in $S \cdot E$, let (a, y) be any point
of $A_{im}(r_m/2)$ and suppose that (a, y) is in $C(S \cdot E)$. It is then on
a closed vertical interval $V_e$, of radius e, containing only points of
$C(S \cdot E)$. By the hypothesis on the space B, the point (a, y) is also on a
vertical interval $V_\epsilon$, of radius exactly $\epsilon$, $V_\epsilon$
being contained in $V_e$. The interval $V_\epsilon$ contains only points of
$C(S \cdot E)$. Let (a, b) be the center of $V_\epsilon$. Since (a, b) must be
in $S_1$ there is some $h_n$ such that $(bh_n) < \epsilon$. The point
$(a, h_n)$ is therefore in $C(S \cdot E)$. From the relations
$(b_i y) < r_m/2$ and $(by) \le \epsilon$, it follows that
$(bb_i) < \tfrac{1}{2} r_m + \epsilon$. Since $(bh_n) < \epsilon$, it follows
that $(h_n b_i) < \tfrac{1}{2} r_m + 2\epsilon \le r_m$. The point $(a, h_n)$,
being in $C(S \cdot E)$, cannot be in $S \cdot E \cdot (y = h_n)$. Therefore
$(a, b_i)$ is not in $K_n$ nor in $A_{im}$. Neither $(a, b_i)$ nor (a, y) can
then be in $A_{im}(r_m/2)$. From this contradiction it follows that
$A_{im}(r_m/2)$ is in $S \cdot E$.

The proof showing that each point of $S \cdot E$ is in some $A_{im}(r_m/2)$ is
analogous to the proof of a similar proposition given in the demonstration of
Theorem 1, and will not be repeated here. Assuming this to be proved,
$S \cdot E = \sum_{i,m} A_{im}(r_m/2)$. The set $A_{im}$ is a finite product of
$O_\alpha$'s and must be an $O_\alpha$. Each $A_{im}(r_m/2)$ must then be an
$O_\alpha$, and therefore $S \cdot E$ is an $O_\alpha$.

Every point of uniform separation of E is therefore the center of an open
sphere S such that $S \cdot E$ is an $O_\alpha$. By Lindelöf's* theorem an
enumerable set $(S_i)$ of such spheres cover the points of uniform separation
of E in E. The set N of points of non-uniform separation of E in E is an
$O_\alpha$ by hypothesis. Therefore $E = \sum_i S_i \cdot E + N$ is an
$O_\alpha$.


* This holds since A XB is separable when A and B are. See the previously cited paper by 
Kuratowski and Ulam. 




THEOREM 12. If the vertical sections of E are closed, a necessary and sufficient
condition that E be an $F_\alpha$ ($\alpha > 0$) is that the set of points of non-uniformity
of E in E be an $F_\alpha$ and that horizontal sections of E be $F_\alpha$'s.

The necessity will first be demonstrated. Horizontal sections of E are
$F_\alpha$'s since they are the products of E and horizontal sections of the
space $A \times B$. The set N of points of non-uniformity of E is closed and
the product of N and E must be an $F_\alpha$.

The sufficiency will now be shown. Let p be any point of uniformity of E.
It is a point of uniform separation of CE and for some sphere S of center p,
$(CE) \cdot S$ is an $O_\alpha$ by Theorem 11. As before, an enumerable number
of such spheres, $S_i$, cover the points of uniformity of E. Since the points
of CE in $\sum_i S_i$ are an $O_\alpha$, the points of E in $\sum_i S_i$ form
an $F_\alpha$. The set N of points of non-uniformity of E in E form an
$F_\alpha$ by hypothesis. Since $E = \sum_i E \cdot S_i + N$, E is an
$F_\alpha$.

The theorem is not true for $\alpha = 0$.


7. GRATINGS AND CATEGORICITY 
In this section A and B are to be metric, separable, locally compact*
spaces. In such spaces the complement of a set of the first category is of the
second category, and open sets are of the second category. These propositions
may be proved by a method similar to the method used for proving them in
euclidean space. It is necessary to make use of the fact that every monotonic
decreasing sequence of non-null compact spheres has a non-null product.†

DEFINITION 7. If a horizontal section (y = b) contains a set H of the second
category [in (y = b)], such that H(r) is in E for some r, (y = b) is said to have
property C with respect to E.

DEFINITION 8. If a horizontal section (y = b) contains a set H, in and everywhere
dense in a set O [in and open in (y = b)] such that H(r) is in E for some
r, (y = b) is said to have property D with respect to E.

The set H(r) of Definition 8 is called a grating. A point p is said to be
within the grating if p is in O(r). A point p is on the grating if it is in H(r).‡
Property C implies property D since a set of the second category must contain
a subset everywhere dense in some open set.

LEMMA 2. If $\mathcal{K}$, a set of vertical sections, is of the second category and if
each $K \in \mathcal{K}$ contains a vertical interval including only points of E, then some
horizontal section has properties C and D with respect to E; furthermore the center
of one of the vertical intervals is on the grating (of property D).

* For a definition of this term see Fréchet, Les Espaces Abstraits, p. 223.


† See Banach, Théorie des Opérations Linéaires, pp. 13 and 14.
‡ If p is on a grating it is within the grating.






Let $\mathcal{K}_{3e}$ be the set of vertical sections containing vertical
intervals of E of radii > 3e, where e has been so chosen that
$\mathcal{K}_{3e}$ is of the second category. Let $V_{3e}$ denote an individual
one of these vertical intervals of radius > 3e and let $\sum V_{3e}$ denote the
points in all such intervals. From each $V_{3e}$ form a vertical interval $V_e$
of radius exactly e with the same center as $V_{3e}$. Each interval $V_e$
consists entirely of points of E. Let $(b_i)$ be an enumerable set of points
everywhere dense in B, and let $B_i$ be a sphere in B of center $b_i$ and
radius 2e. Let $A_i$ be the subset of A such that for each $a \in A_i$, there
is some $V_e$ and a corresponding $V_{3e}$ for which
$V_e \subset a \times B_i \subset V_{3e}$.

It will now be shown that for each $V_e$ and corresponding $V_{3e}$, of center
(a, b), there is an i such that $V_e \subset a \times B_i \subset V_{3e}$, and
consequently that $\sum A_i$ is the set in which the sections of
$\mathcal{K}_{3e}$ cut A. In order to do this, choose i such that
$(bb_i) < e$. For each point (a, y) of $V_e$, $(by) < e$. By the triangle
axiom, $(b_i y) < 2e$. This shows that the vertical interval $a \times B_i$ of
center $(a, b_i)$ and radius 2e includes $V_e$. It will now be shown that
$a \times B_i \subset V_{3e}$. If (a, y) is any point of $a \times B_i$,
$(b_i y) < 2e$. Since $(bb_i) < e$, it follows from the triangle axiom that
$(by) < 3e$ which is the condition that (a, y) be in $V_{3e}$.

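The two applications of the triangle axiom in the preceding paragraph can be set side by side. This is a sketch in modern notation, with the inclusion sign standing for "is contained in".

```latex
\begin{align*}
(a,y)\in V_e
  &\;\Longrightarrow\; (b_iy)\le (b_ib)+(by) < e+e = 2e
  &&\Longrightarrow\; V_e\subset a\times B_i,\\
(a,y)\in a\times B_i
  &\;\Longrightarrow\; (by)\le (bb_i)+(b_iy) < e+2e = 3e
  &&\Longrightarrow\; a\times B_i\subset V_{3e}.
\end{align*}
```

Both lines use only $(bb_i) < e$, which is available because the points $b_i$ are everywhere dense in B.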
Since $\sum A_i$ is the set in which the sections of $\mathcal{K}_{3e}$ cut A,
it follows that $\sum A_i$ is of the second category and, consequently, that
some particular $A_i$, say $A_n$, is of the second category. The section
$(y = b_n)$ has property C with respect to E, for the set $A_n \times b_n$ is
of the second category in $(y = b_n)$, and each point of $A_n \times b_n$ is
the center of a vertical interval of radius 2e which includes only points of E.
The section $(y = b_n)$ must then also have property D with respect to E. Since
each of these vertical intervals contains an interval $V_e$ it must contain the
center of this interval $V_e$ which is the center of the corresponding original
interval $V_{3e}$. Therefore the center of one of the original intervals is on
each vertical interval of the grating.

It is evident that when A and B have similar properties, the parts played 
by horizontal and vertical sections in any theorem may be interchanged. 


DEFINITION 9. A point is said to be of the second category with respect to E 
if every neighborhood of the point contains a subset of E of the second category. 


The set of all points of the second category with respect to E is denoted by
$E_{II}$. The set $E_{II}$ is closed.

A necessary and sufficient condition that E be of the second category is
that $E_{II}$ be of the second category. This implies that if E is of the second
category, it must be of the second category at a set everywhere dense in an open
set, and since $E_{II}$ is closed it must be of the second category at each point of
an open set.†


† See Banach, Théorie des Opérations Linéaires, p. 13 and the reference there given.




THEOREM 13. If vertical sections of R are I-sets and horizontal sections of T
are I-sets, and $R \cdot T = 0$, then $R' \cdot T + R \cdot T'$ is of the first category in $A \times B$.


It is sufficient to show that $R \cdot T'$ is of the first category in
$A \times B$. Assume that $R \cdot T'$ is of the second category in
$A \times B$. It must then be of the second category at every point of a set O,
open in $A \times B$, which implies that R and T are both dense in O. It
follows from the hypothesis, that on each vertical section containing a point
of $R \cdot T' \cdot O$, there is a vertical interval V containing only points
of R and such that V is in O. Since the vertical sections containing points of
$R \cdot T' \cdot O$ form a set of the second category†, Lemma 2 may be
applied. By this lemma, there is a horizontal section L containing a set H
everywhere dense in O* (O* a set open in L) such that H(r) is in R. The set
H(r) is also in O since the intervals V from which it is constructed are in O.
Suppose there is a point p of T within H(r) and let the horizontal section
containing p be L*. The section L* must contain an inner point (with respect
to L*) of T which lies in O*(r). But this is impossible because
$L^* \cdot H(r)$ is dense in $L^* \cdot O^*(r)$ and H(r) contains only points
of R. There can be, then, no points of T within H(r), but this is a
contradiction since T must be dense in O. Therefore $R \cdot T'$ is of the
first category in $A \times B$.


COROLLARY 1. If vertical sections of E are I-sets and horizontal sections of
CE are I-sets, then $E' \cdot (CE) + E \cdot (CE)'$ is of the first category in $A \times B$.


This follows immediately from the theorem and the fact that $E \cdot (CE) = 0$.


COROLLARY 2. If vertical sections of E are I-sets and horizontal sections of
CE are I-sets, there is an inner point of either E or CE in every set O, open in
$A \times B$.


This follows from Corollary 1. Points not belonging to $E' \cdot (CE) + E \cdot (CE)'$
are inner points either of E or of CE and this set is everywhere dense in
$A \times B$.


THEOREM 14. If horizontal and vertical sections of E are I-sets and horizontal
sections of CE are I-sets, then E is an I-set in $A \times B$.


Let p be any point of E lying on a horizontal section L and let O be any
open set in $A \times B$ containing p. It is necessary to show that O contains
an inner point of E. By hypothesis $L \cdot E$ contains a set O* open in L. The
set O* is of the second category in L, and each vertical section K cutting O*
must contain a vertical interval V including only points of E and lying in O.
From these V's, there may be formed a grating H(r) containing only points of E
and contained in O. No point of CE can be within this grating because
horizontal sections of CE are I-sets. The argument is similar to the one in
Theorem 13 and will not be repeated.


† If this were not true, $R \cdot T' \cdot O$ would be of the first category in $A \times B$.

Kuratowski and Ulam have a theorem similar to the following: 


THEOREM 15. If E is a set whose horizontal sections are I-sets, and O is an
open set in $A \times B$, a necessary and sufficient condition that $E \cdot O$ be dense in O,
is that the vertical sections L for which $L \cdot E \cdot O$ is not dense in $L \cdot O$ form a set
$\mathcal{L}$ of the first category.


The sufficiency of the condition follows from the fact that if vertical
sections K, such that $K \cdot E \cdot O$ is dense in $K \cdot O$, form a set
complementary to a set of the first category, they are everywhere dense and
therefore the points in $K \cdot E \cdot O$, considering all K, must be dense
in O.

In order to prove the necessity, let $E \cdot O$ be dense in O and suppose the
set $\mathcal{L}$ to be of the second category. On each L there is a vertical
interval which is in O and which contains no points of E, that is, it contains
only points of CE. By Lemma 2, CE must contain a grating which is in O. But
this grating can have within it no point of E since horizontal sections of E
are I-sets. This contradicts the hypothesis that E is dense in O, and the
theorem is proved.


THEOREM 16. If vertical sections of R are open, horizontal sections of T are
I-sets and $R \cdot T = 0$, then $R \cdot T'$ lies on a set $\mathcal{K}$ of vertical sections, which is of
the first category.


Suppose that $\mathcal{K}$ is of the second category. Each $K \in \mathcal{K}$
contains a vertical interval with a point of $R \cdot T'$ as center and
containing only points of R. By Lemma 2, there is a grating, composed of points
of R, containing a point of $R \cdot T'$ on its interior. This is impossible
because horizontal sections of T are I-sets.


COROLLARY 3. If vertical sections of E are closed and horizontal sections of
E are I-sets, then $(CE) \cdot E'$ lies on a set $\mathcal{K}$ of vertical sections, which is of the
first category.


This corollary may be proved by replacing R and T by CE and E in
Theorem 16.

If there exists a set $\mathcal{K}$ of vertical sections, and a set
$\mathcal{L}$ of horizontal sections, so that every point of E lies either on a
member of $\mathcal{K}$ or a member of $\mathcal{L}$, the set E is said to lie
on the set $\mathcal{K}$ plus the set $\mathcal{L}$. This language is used to
distinguish this case from the case in which every point of E lies both on a
member of $\mathcal{K}$ and on a member of $\mathcal{L}$; in this latter case E
is said to lie on the set $\mathcal{K}$ and on the set $\mathcal{L}$.




COROLLARY 4. If vertical sections of R are open, horizontal sections of T
are open and $R \cdot T = 0$, then $R' \cdot T + R \cdot T'$ lies on a set $\mathcal{K}$ of vertical sections plus
a set $\mathcal{L}$ of horizontal sections, both $\mathcal{K}$ and $\mathcal{L}$ being of the first category.


This is a slightly stronger conclusion than that of Theorem 13, made 
possible by the stronger hypothesis given here. 


COROLLARY 5. If vertical and horizontal sections of both R and T are open,
and $R \cdot T = 0$, then $R' \cdot T + R \cdot T'$ lies on a set $\mathcal{K}$ of vertical sections and a set $\mathcal{L}$
of horizontal sections, both $\mathcal{K}$ and $\mathcal{L}$ being of the first category.


In Corollary 5, the projection of $R' \cdot T + R \cdot T'$ on any horizontal or vertical
section must be of the first category in the section. This is not necessarily
true in Corollary 4.

It will be assumed in the following theorem that the space A is dense in 
itself in order that the set E there considered may be perfect. It will also be 
assumed that A and B have the properties necessary to apply Theorem 10. 

A point of closure of a set E is a point in some neighborhood of which E
is closed.


THEOREM 17. If horizontal and vertical sections of E are closed and each 
point of E lies on a closed horizontal interval containing only points of E, 
points of closure of E in E are dense on E. 


By Theorem 10, E is an $F_\sigma$. It must then be an $F_\sigma$ in
$\bar{E} = E + E'$.† It is necessary to prove that $\bar{E} - E$ is nowhere
dense in $\bar{E}$ or, in other words, that limit points of E not in E are not
dense on $\bar{E}$. It will be shown that $\bar{E} - E$ is of the first
category. By Corollary 3, $\bar{E} - E$ (which is the same as $(CE) \cdot E'$)
lies on a set $\mathcal{K}$ of vertical sections, of the first category. Let
$A_0$ be the set of points in which the sections of $\mathcal{K}$ cut A. The
set $A_0 = \sum A_i$ where each $A_i$ is nowhere dense in A. Let $R_i$ be the
points of $\bar{E} - E$ lying on those sections, of the set $\mathcal{K}$,
which cut A in $A_i$. If $R_i$ were dense on some portion of $\bar{E}$, it
would have to have as a limit point every point of some horizontal interval.
This follows from the hypothesis that every point of E lies on a horizontal
interval. This is impossible since the projection, $A_i$, of $R_i$ on A, would
then be everywhere dense in some open set in A. Therefore $R_i$ is nowhere
dense in $\bar{E}$, and $\bar{E} - E = \sum R_i$ is of the first category in
$\bar{E}$. It follows that $\bar{E} - E$ is nowhere dense in $\bar{E}$, for
$\bar{E} - E$ is a $G_\delta$ and if a $G_\delta$ is of the first category, it
is nowhere dense.‡


† In this particular case $\bar{E} = E'$.
‡ For example see Blue, Mathematische Annalen, vol. 102, p. 627, in the proof of Theorem 1.




8. APPLICATIONS 


The spaces A and B are restricted here in the same manner as in §7. An
interesting application of Corollary 5 is in the proof of the following result
of Baire†:

If f(x, y) is a real-valued function defined on the space A XB and is con- 
tinuous in each of the variables separately, points of discontinuity of f(x, y) 
lie on a set of vertical sections and a set of horizontal sections, both sets of 
sections being of the first category. 

Let $(r_i)$ be the set of rational numbers. Let $R_i$ be the points of
$A \times B$ at which $f(x, y) > r_i$ and let $T_i$ be the set at which
$f(x, y) < r_i$. From the properties of continuous functions, $R_i$ and $T_i$
have open horizontal and vertical sections and are disjoined. It follows that
$R_i' \cdot T_j + R_i \cdot T_j'$ lies on a set of horizontal and a set of
vertical sections of the first category, and the same is true of
$\sum_{ij} (R_i' \cdot T_j + R_i \cdot T_j')$, the sum being taken only over
pairs of i and j for which $r_j < r_i$. The points of this sum are the points
of discontinuity of f(x, y).

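The identification of this sum with the set of discontinuities may be sketched as follows. The symbol D for the set of discontinuities and the limsup and liminf argument in the comments are supplied here as an illustration and are not part of the original proof.

```latex
% D is a hypothetical name for the set of points of discontinuity of f.
\[
R_i=\{f>r_i\},\qquad T_j=\{f<r_j\},\qquad
D=\bigcup_{r_j<r_i}\bigl(R_i'\cdot T_j + R_i\cdot T_j'\bigr).
\]
% If \limsup_{q\to p} f(q) > f(p), choose rationals with
%   f(p) < r_j < r_i < \limsup_{q\to p} f(q);  then p \in R_i'\cdot T_j.
% If \liminf_{q\to p} f(q) < f(p), choose rationals with
%   \liminf_{q\to p} f(q) < r_j < r_i < f(p);  then p \in R_i\cdot T_j'.
% Conversely, a point of R_i'\cdot T_j has f(p) < r_j and is a limit of
% points where f > r_i > r_j, so f is discontinuous there; similarly for
% R_i\cdot T_j'.
```

The restriction to pairs with $r_j < r_i$ is what makes $R_i$ and $T_j$ disjoint, so that Corollary 5 applies to each pair.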
Applying Corollary 4 in the same way to a function upper semi-continuous
in one variable and lower semi-continuous in the other, it may be shown that
the set of discontinuities, E, of such a function lies on a set of horizontal
sections plus a set of vertical sections of the first category. The set E is of the
first category, but not all sets of the first category lie on a set of horizontal
plus a set of vertical sections of the first category. This conclusion therefore
contains a result not given by Kempisty.‡


† Acta Mathematica, 1899, p. 94.
‡ Fundamenta Mathematicae, vol. 14, p. 237.


UNIVERSITY OF IOWA,
IOWA CITY, IOWA


INVARIANTS OF PFAFFIAN SYSTEMS* 


BY 
MABEL GRIFFIN 


1. Introduction. The developments in this paper are based on a series of
invariant sets of forms associated with a given pfaffian system. These forms
are obtained by exterior multiplication of the given pfaffians and their derived
forms. The first set $\Omega^i$ of r forms, obtained in turn by multiplying the
product of all the given pfaffians by each of the derived forms, has been employed
by Cartan. The vanishing of this set is a necessary and sufficient condition
that the system be passive. The present paper interprets the vanishing
of the second set in the light of the notion of a primitive system, i.e., one whose
derived system is the given set of equations.

Associated with any pfaffian system are invariant pfaffian systems, formed 
by equating to zero the linear factors common to any set of forms 
$\Omega^{i_1 i_2 \cdots i_k}$. Any arithmetical invariant of one of the latter 
systems is invariant for the former; such invariants are the number of 
equations in the system, the class, the species, etc. 

Another arithmetical invariant arises from the fact that every system pos- 
sesses a primitive system. 

The importance of arithmetical invariants for pfaffian systems lies in their 
usefulness in determining the non-equivalence of pfaffian systems. Riquier† 
has given methods which may be applied to determine whether or not two 
pfaffian systems are equivalent, but the algebraic operations involved in their 
application are in general too complicated to carry out, whereas the 
comparison of arithmetical invariants will often settle the question. 

The $\Omega$'s are used (§8) to give a criterion for reducibility to a canonical 
system of a particular type, designated as completely separable, and to state 
necessary and sufficient conditions for the equivalence of such systems. 

The reader is assumed to be familiar with the contents of Goursat's 
treatise.‡ 

2. The fundamental invariant forms. Consider the pfaffian system 


(2.1)    $S:\quad \omega^1 = 0,\ \omega^2 = 0,\ \ldots,\ \omega^r = 0.$ 


* Presented to the Society, December 27, 1932; received by the editors February 18, 1933. The 
results in this paper are taken from a doctoral dissertation in mathematics presented at Duke Uni- 
versity, June, 1933. 

† C. Riquier, Les Systèmes d'Équations aux Dérivées Partielles, Paris, 1910. 

‡ E. Goursat, Leçons sur le Problème de Pfaff, Paris, 1922. 




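Let the symbol $\Omega^{i_1 i_2 \cdots i_k}$ be defined as follows: 


(2.2)    $\Omega^{i_1 i_2 \cdots i_k} = \omega^1 \omega^2 \cdots \omega^r\, \omega'^{i_1} \omega'^{i_2} \cdots \omega'^{i_k},$ 


where $i_1 i_2 \cdots i_k$ represents any combination of numbers selected from 
$1, 2, \ldots, r$ and $\omega'$ is the derived form of $\omega$. The forms 
$\Omega^{i_1 i_2 \cdots i_k}$ are invariants of the system S. Moreover they are 
symmetric in every pair of superscripts. 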
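Assuming the non-singular transformation 


(2.3)    $\bar\omega^i = a^i_\alpha \omega^\alpha, \qquad a = |a^i_\alpha| \neq 0,$ 

we have 

$\bar\omega'^i = a^i_\alpha\, \omega'^\alpha + da^i_\alpha\, \omega^\alpha,$ 

where the repeated indices on the right indicate summation over the range 
$1, 2, \ldots, r$. This transformation induces on the $\Omega$'s the linear 
homogeneous transformation 


(2.4)    $\bar\Omega^{i_1 \cdots i_k} = a\, a^{i_1}_{\alpha_1} \cdots a^{i_k}_{\alpha_k}\, \Omega^{\alpha_1 \cdots \alpha_k}.$ 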


Conversely, there exists a transformation (2.3) which induces a given 
transformation 


(2.5) = 


on the $\Omega$'s provided the a's satisfy 


(2.6) = aP(aciac, ant), 
where P indicates the summation of all terms obtained by permuting 
- The conditions for compatibility of (2.6) are 


If k=1, conditions (2.7) are identically satisfied. 
3. Linear dependence of the $\Omega$'s. Expressing that the relations 


(3.1)    $C_{i_1 i_2 \cdots i_k}\, \Omega^{i_1 i_2 \cdots i_k} = 0$ 


are identically satisfied in the differentials gives a set of linear homogeneous 
equations on the C's, whose rank can be proved invariant under 
transformations (2.3). We shall employ Sylvester's term “nullity” as a name for 
the invariant $p_k$, which is the number of linearly independent solutions of 
(3.1). 

In the case of the $\Omega$'s with a single index, by a transformation (2.3) an 
$\Omega$ can be made zero corresponding to each relation (3.1). This gives the 
theory of the derived system,* which is an invariant system of the original. 
Suppose it is possible to make the following $p$ $\Omega$'s of order $k > 1$ vanish: 


(3.2)    $\Omega^{11 \cdots 1} = \Omega^{22 \cdots 2} = \cdots = \Omega^{pp \cdots p} = 0,$ 


where $p$ is the corresponding nullity. It can be proved by methods similar to 
those developed in greater detail in §8 that any transformation (2.3) which 
leaves (3.2) invariant permutes the first $p$ equations of S among themselves. 
Those equations therefore form an invariant system of S. 

4. Further invariant systems associated with a pfaffian system. All the 
$\Omega$'s of sufficiently high order for a given system are zero since the 
degree of $\Omega^{i_1 i_2 \cdots i_k}$ finally exceeds the class of the system. 
Suppose every $\Omega$ with $p+1$ indices is identically zero, whereas some 
$\Omega$ with $p$ indices does not vanish. Then $2p$ is an arithmetical 
invariant of the system. It is, in fact, easily identified with the invariant 
defined in a different manner by Engel† and called by him the rank of the 
system. For $q \leq p$ let $S_q$ be defined as the system composed of the 
equations formed by setting all common factors of $\Omega^{i_1 i_2 \cdots i_q}$ 
equal to zero. The system $S_q$ is always contained in $S_{q+1}$. We have then a 
sequence of invariant pfaffian systems all of whose arithmetical invariants are 
also invariants for S. 


THEOREM 1. A system S is of species one if and only if the corresponding $S_p$ 
is passive and does not coincide with S.‡ 

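THEOREM 2. If S is of rank two, it can be put in a form satisfying 

(4.1)    $\omega'^1 = 0,\ \ldots,\ \omega'^{r-3} = 0,\quad \omega'^{r-2} = \phi^2\phi^3,\quad \omega'^{r-1} = \phi^3\phi^1,\quad \omega'^r = \phi^1\phi^2, \mod \omega^1, \omega^2, \ldots, \omega^r,$ 

or 

(4.2)    $\omega'^1 = \cdots = \omega'^{r'} = 0,\quad \omega'^{r'+1} = \phi\psi^1,\ \ldots,\ \omega'^r = \phi\psi^{r-r'}, \mod \omega^1, \omega^2, \ldots, \omega^r.$ 


In the first case, $S_1 = S$; in the second, $S_1 > S$. Conversely, if $S_1 > S$, 
then S is of rank two and satisfies (4.2). 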

The proof of Theorem 2 follows. Since S is of rank two, every $\Omega$ of order 
two must vanish. The vanishing of $\Omega^{ij}$ when $i = j$ indicates that every 
$\omega'^i$ must vanish or be of rank two mod $\omega^1, \ldots, \omega^r$; thus 
the derived form of every $\omega$ 


* Goursat, p. 294. 

† F. Engel, Leipziger Berichte, vol. 52 (1890). This invariant is zero if and only if the system is 
passive. 

‡ For the definition of species and for material which facilitates the proof of the above theorem, 
see J. M. Thomas, Pfaffian systems of species one, these Transactions, vol. 35 (1933). 




not contained in the derived system is of rank two mod $\omega^1, \ldots, \omega^r$. 
Since this is true, the vanishing of $\Omega^{ij}$ when $i \neq j$ implies that 
every pair of non-vanishing derived forms possesses a common factor. This is 
possible in only two ways: S must satisfy (4.1) or (4.2). When (4.1) is 
satisfied, the derived system contains $r - 3$ equations. 

5. Primitive systems. If $\Sigma$ has S for its derived system, $\Sigma$ will be 
called a primitive system of S. 


THEOREM 3. Every pfaffian system has a primitive system. 
Let us assume that the derived system of 

(5.1)    $S:\quad \omega^1 = 0,\ \ldots,\ \omega^r = 0$ 

is 

(5.2)    $\omega^1 = 0,\ \ldots,\ \omega^{r'} = 0.$ 

The first $r'$ derived forms of S then vanish by virtue of the system. When 
reduced by (5.1), the non-vanishing forms are quadratic in 
$\phi^1, \ldots, \phi^{n-r}$, which with the $\omega$'s constitute an independent 
set of forms. Consequently, they vanish by virtue of 


(5.3) = 9! = 0, ol= — = 0. 


If in addition the $\lambda$'s form with the original variables an independent 
set, the derived form of no left member of this set of equations can vanish by 
virtue of S and (5.3). Hence the system $\Sigma = S + (5.3)$ is a primitive 
system of S. 


THEOREM 4. The minimum number of equations which adjoined to S yield 
a primitive system is an invariant of S. 


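Let primitive systems for two equivalent systems $S$, $\bar S$ be 

(5.4)    $\Sigma:\quad \omega^1 = 0,\ \ldots,\ \omega^r = 0,\ \phi^1 = 0,\ \ldots,\ \phi^k = 0,$ 

(5.5)    $\bar\Sigma:\quad \bar\omega^1 = 0,\ \ldots,\ \bar\omega^r = 0,\ \bar\phi^1 = 0,\ \ldots,\ \bar\phi^l = 0,$ 


respectively, where $k$ and $l$ are least. The derived forms of $\bar S$, being 
linear homogeneous combinations of those of $S$, vanish whenever those of $S$ 
do, and vice versa. This shows that $l \leq k$ and $k \leq l$, whence $k = l$. 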

Every passive system (species zero) of $r$ equations is the derived system 
of a system of $r+1$ equations. The question of when the adjunction of a single 
equation to a system of species one furnishes a primitive system is answered by 
the following: 


THEOREM 5. Every system of $r$ equations whose species is one is the derived 
system of a system of $2r - r'$ equations, but of no smaller number. 


A transformation (2.3) and a change of variables will put any system of 
species one in the form 


(5.6)    $dx^1 = \cdots = dx^{r'} = 0,\quad dx^{r'+1} - A^{r'+1} dx^{r+1} = \cdots = dx^r - A^r dx^{r+1} = 0,$ 


where the first $r'$ equations constitute the derived system. A system has (5.6) 
in its derived system if and only if it implies the vanishing of the forms 
$dA^\alpha\, dx^{r+1}$, that is, if and only if it implies 


(5.7)    $dA^\alpha - \lambda^\alpha dx^{r+1} = 0 \qquad (\alpha = r'+1, \ldots, r).$ 


If the forms $dA^\alpha$ were linearly dependent by virtue of (5.6), there would 
be more than $r'$ equations in the derived system of (5.6). Therefore equations 
(5.7) form with (5.6) an independent set of $2r - r'$ equations, and no 
primitive system contains fewer equations. If $x$, $A$, $\lambda$ constitute a 
set of independent variables, the system composed of (5.6) and (5.7) is a 
primitive system of (5.6). 


THEOREM 6. When the species exceeds one, the adjunction of a single equation 
gives a primitive system if and only if the class is $2r - r' + 1$ and the rank 
two. 

If we put 

$\omega'^i = G^i \mod \omega^1, \ldots, \omega^r \qquad (i = r'+1, \ldots, r),$ 

the conditions of the theorem can be restated: the class of the set $G^i$ is 
$r - r' + 1$, and 

(5.8)    $G^i G^j = 0 \qquad (i, j = r'+1, \ldots, r).$ 

If the adjunction of 

(5.9)    $\phi = 0$ 

gives a primitive system, 

$G^i = \phi\psi^i.$ 

Hence (5.8) are satisfied. If the forms $\omega$, $\phi$, $\psi$ were not 
independent, the G's would be linearly dependent and the derived system would 
contain more than $r'$ equations. Hence $\omega$, $\phi$, $\psi$ are independent 
and the class of the system $G^i$ is $r - r' + 1$. Conversely, when (5.8) are 
satisfied, Theorem 2 shows that S can be displayed as (4.1) or (4.2). The class 
of (4.1) is $r + 3$, however, whereas the expression $2r - r' + 1$ reduces to 
$r + 4$ when $r' = r - 3$. Hence S is in the form (4.2), and the $\phi$ of those 
formulas furnishes a primitive system. 




6. A theorem on matrices. A square matrix is monomial if it contains one 
and only one non-zero element on each row and on each column. The totality 
of monomial matrices of a given order has the group property under multi- 
plication. It will be called the monomial group. 

LEMMA. If a non-singular square matrix is multiplied by a properly chosen 
monomial matrix, every element in the main diagonal of the resulting matrix 
is different from zero. 

Let the given matrix be M. Consider any non-vanishing term of the expansion 
of the determinant |M|. In the matrix M replace each element of this 
term by unity and every other element of the matrix by zero; call the 
transpose of the resulting matrix N. Then MN has the chosen non-zero elements 
on its main diagonal. 
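
The construction in this proof can be sketched in Python. The names `monomial_for` and `matmul` and the list-of-lists matrix layout are assumptions of the sketch, not notation from the paper. The search walks through the terms of the determinant expansion, takes the first one with no zero factor, and builds the monomial matrix from it as the proof describes.

```python
from itertools import permutations


def matmul(A, B):
    """Plain matrix product of two square list-of-lists matrices."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def monomial_for(M):
    """Return a 0/1 monomial matrix N such that M*N has a non-zero diagonal.

    Each permutation `perm` names one term of the determinant expansion,
    the product of the entries M[i][perm[i]].
    """
    n = len(M)
    for perm in permutations(range(n)):
        if all(M[i][perm[i]] != 0 for i in range(n)):
            # Put unity at the positions (i, perm[i]) of the chosen term and
            # take the transpose: N has unity at (perm[i], i).
            N = [[0] * n for _ in range(n)]
            for i in range(n):
                N[perm[i]][i] = 1
            return N
    raise ValueError("every term of the determinant vanishes; M is singular")


M = [[0, 2, 1],
     [3, 0, 0],
     [0, 0, 5]]
N = monomial_for(M)
product_matrix = matmul(M, N)
print([product_matrix[i][i] for i in range(3)])  # diagonal of M*N
```

Entry (i, i) of MN is the sum of M[i][k] N[k][i] over k, and N[k][i] is unity only for k = perm[i], so the diagonal of MN consists exactly of the factors of the chosen term.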

THEOREM 7. If the elements $a^i_j$ of a non-singular square matrix M satisfy 
the conditions 

(6.1)    $P(a^{i_1}_1 a^{i_2}_2 \cdots a^{i_r}_r) = 0,$ 

where the $i_1, \ldots, i_r$ are any set of values from the range 
$1, 2, \ldots, r$ such that at least two of them are equal and P indicates the 
summation of all the terms obtained by permuting the subscripts, then M is 
monomial. 
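
Theorem 7 can be checked by brute force on small matrices. The sketch below reads (6.1) as follows, which is an assumption about the index layout: row indices $i_1, \ldots, i_r$ are fixed, one factor is taken from each chosen row, and the sum runs over all permutations of the column subscripts. All function names are illustrative.

```python
from itertools import permutations, product


def perm_sum(M, rows):
    """P-sum for the chosen rows: sum over permutations s of the columns
    of M[rows[0]][s[0]] * M[rows[1]][s[1]] * ..."""
    n = len(M)
    total = 0
    for s in permutations(range(n)):
        term = 1
        for k in range(n):
            term *= M[rows[k]][s[k]]
        total += term
    return total


def satisfies_6_1(M):
    """True when the P-sum vanishes for every row choice with a repeat."""
    n = len(M)
    return all(perm_sum(M, rows) == 0
               for rows in product(range(n), repeat=n)
               if len(set(rows)) < n)


def det(M):
    """Determinant by the permutation expansion (fine for small orders)."""
    n = len(M)
    total = 0
    for s in permutations(range(n)):
        inversions = sum(1 for i in range(n) for j in range(i + 1, n)
                         if s[i] > s[j])
        term = -1 if inversions % 2 else 1
        for k in range(n):
            term *= M[k][s[k]]
        total += term
    return total


def is_monomial(M):
    """Exactly one non-zero element on each row and on each column."""
    n = len(M)
    rows_ok = all(sum(1 for x in row if x != 0) == 1 for row in M)
    cols_ok = all(sum(1 for i in range(n) if M[i][j] != 0) == 1
                  for j in range(n))
    return rows_ok and cols_ok


# Every non-singular 2x2 integer matrix with entries in -2..2 that
# satisfies the conditions should turn out to be monomial.
for e in product(range(-2, 3), repeat=4):
    M = [[e[0], e[1]], [e[2], e[3]]]
    if det(M) != 0 and satisfies_6_1(M):
        assert is_monomial(M)
```

An exhaustive search of this kind only illustrates the statement for the entries tried; it is not a substitute for the inductive proof.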

Consider the product MN, where N is constructed as in the preceding 
lemma. This matrix MN also satisfies conditions (6.1) because multiplication 
on the right by N simply permutes the columns, thus permuting the 
subscripts in (6.1). Hence MN is a matrix satisfying (6.1) and also 

(6.2)    $a^1_1 a^2_2 \cdots a^r_r \neq 0.$ 

If MN is monomial, then M is too. Therefore it suffices to show that any 
matrix satisfying conditions (6.1) and (6.2) is monomial. 

The theorem holds for $r = 2$ because conditions (6.1) and (6.2) are 

$a^1_1 a^1_2 = 0,\quad a^2_1 a^2_2 = 0;\quad a^1_1 a^2_2 \neq 0.$ 

Assume the theorem true for every matrix of order less than $r$ and suppose 
(6.1), (6.2) satisfied by a matrix of order $r$. From (6.1) we have 
$a^1_1 a^1_2 \cdots a^1_r = 0$, whence $a^1_2 a^1_3 \cdots a^1_r = 0$. To prove 

(6.3)    $a^1_2 = a^1_3 = \cdots = a^1_r = 0$ 

we employ induction. Suppose $k - 1$ elements on the first row are zero. By 
interchanging, if necessary, certain rows and the corresponding columns, we 
make those elements $a^1_2, a^1_3, \ldots, a^1_k$ and preserve the condition (6.2). 


From (6.1), 


1 1 
P(aidq - Gr) = 




where $i_2, \ldots, i_k$ are chosen from the range $2, \ldots, k$ in every possible 
way. Since $a^1_2 = a^1_3 = \cdots = a^1_k = 0$ and $a^1_1 \neq 0$, this reduces to 


1 1 


We wish to show that 

(6.4)    $a^1_{k+1} = 0$ 


holds. Assuming the contrary, we have 


(6.5) P(a;'- - = 0. 


But the conditions of the theorem are then satisfied for the range 
$2, \ldots, k$ and by assumption they imply 


(6.6)    $a^i_j = 0 \qquad (i, j = 2, 3, \ldots, k;\ i \neq j).$ 


The condition $P(a^2_2 \cdots a^k_k) = 0$, where the upper indices are permuted, 
is also implied by (6.5). Because of (6.6) it reduces to 
$a^2_2 \cdots a^k_k = 0$. This contradiction forces us to conclude that (6.4) is 
true. Hence we may assume $a^1_{k+1} = 0$, and by induction we reach the result 
that all elements except $a^1_1$ in the first row are zero. Since any row can be 
made the first by a transformation that preserves (6.2), the same argument can 
subsequently be applied to each of the other rows to show that 


$a^i_j = 0 \qquad (i, j = 1, \ldots, r;\ i \neq j),$ 


and the theorem therefore is true for the matrix of order $r$. 
7. Partition of pfaffian systems. Suppose that for a pfaffian system cer- 
tain relations 


(7.1)    $F_1 = 0,\ F_2 \neq 0,\ \cdots$ 


are satisfied. Let G be the subgroup of transformations (2.3) leaving (7.1) 
invariant. If G is intransitive or imprimitive,* the transformation (2.3) 
which displays its ultimate sets of intransitivity or imprimitivity will exhibit 
S as the sum of a number of pfaffian systems 

$S = S_1 + S_2 + \cdots + S_\nu.$† 
This will be called a partition of S. 


If G is intransitive, each $S_i$ is an invariant system of S. If G is imprimitive, 


* These terms are employed in the sense defined by H. F. Blichfeldt, Finite Collineation Groups, 
Chicago, 1917, p. 17 and p. 76. 
† We apply the symbol + only to aggregates having no elements in common. 




the systems $S_i$ all contain the same number of equations and are permuted 
when S is subjected to the general transformation (2.3).* 

If each of the systems S; contains only one equation, the partition will 
be called complete; this occurs only when the group G is monomial. 


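THEOREM 8. The conditions 

(7.2)    $\Omega^{12 \cdots r} \neq 0,\qquad \Omega^{i_1 i_2 \cdots i_r} = 0 \qquad (i_1 i_2 \cdots i_r \neq 12 \cdots r)$ 


define a complete partition of the pfaffian system. 

By (2.4) the conditions that the second of (7.2) be preserved under (2.3) 
are 

$a\, a^{i_1}_{\alpha_1} a^{i_2}_{\alpha_2} \cdots a^{i_r}_{\alpha_r}\, \Omega^{\alpha_1 \alpha_2 \cdots \alpha_r} = 0$ 

and reduce precisely to (6.1). The result follows from Theorem 7. 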
A system admitting partition is 
(7.3) dx! + = 0, dxt + = 0, dx + x®dx* = 0. 
For it the defining relations (7.1) are 
= = 0% = = 0, p, = 4. 


8. Separable systems. A system will be called separable if it is equivalent 
to a system S expressed in terms of a set of independent variables X such that 


$S = S_1 + S_2,\qquad X = X_1 + X_2,$ 


where $S_1$ is expressed in terms of the variables $X_1$, $S_2$ in terms of 
$X_2$, and neither $S_1$ nor $S_2$ is vacuous. 

The system is completely separable if it is equivalent to 

$S = S_1 + S_2 + \cdots + S_r$ 

expressed in terms of variables 

$X = X_1 + X_2 + \cdots + X_r,$ 

each $S_i$ containing a single equation and being expressed in terms of the 
variables in the corresponding $X_i$. The following is readily proved: 

THEOREM 9. A completely separable system can be written in a canonical 
form each equation of which is in the canonical form for a single equation. 

Generalization of a known method† proves 


* Invariant systems like those of §3 arise when G is merely reducible. 
† Cf. Goursat, p. 308, where the method is employed in reducing a system of two equations in 
four variables to canonical form. 




THEOREM 10. A pfaffian system of r equations having r—1 independent 
integrals is completely separable. 


We shall now derive a necessary and sufficient condition that a system be 
completely separable. Suppose a completely separable system written in 
canonical form. Let $\omega'^i = G^i$, mod $\omega^1, \ldots, \omega^r$. The number 
of equations in the derived system is the number of independent solutions of 


(8.1)    $\lambda_i G^i = 0.$ 

Since $G^1, \ldots, G^r$ have no differentials in common, 

(8.2)    $\lambda_1 G^1 = 0,\ \ldots,\ \lambda_r G^r = 0.$ 

Suppose exactly $q$ of the G's are zero. They can be made $G^1, \ldots, G^q$, and 
the first $q$ equations of S are 

(8.3)    $dx^1 = 0,\ \ldots,\ dx^q = 0.$ 


The remaining $\lambda$'s are zero and the number of independent solutions of 
(8.1) is $q$. Hence (8.3) is the derived system $S'$ of S. The derived system of 
a completely separable system is therefore passive. We have 


(8.4)    $\Omega^1 = 0,\ \ldots,\ \Omega^q = 0,$ 


whereas the remaining $\Omega$'s of the first order are linearly independent. 

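Let the class of $\omega^i$ be $2m_i + 1$. It is easily verified that every 
$\Omega$ of order $m_{q+1} + m_{q+2} + \cdots + m_r$ is zero except 

(8.5)    $\Omega^{(q+1) \cdots (q+1)\, (q+2) \cdots (q+2)\, \cdots\, r \cdots r},$ 


where the superscript $i$ occurs $m_i$ times. We denote the conditions that the 
$\Omega$'s other than (8.5) vanish by 


(8.6)    $\Omega^{\alpha\beta \cdots \gamma} = 0.$ 


Consider a transformation (2.3) which leaves the two sets of conditions 
above invariant. The preservation of (8.4) gives 


(8.7)    $a^i_j = 0 \qquad (i = 1, \ldots, q;\ j = q+1, \ldots, r).$ 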


The preservation of the other conditions (8.6) gives 


where $\alpha, \beta, \ldots, \gamma$ have any values from the range 
$q+1, \ldots, r$ except the values occurring in (8.5). In particular, when all 
the $\alpha$'s are equal, all the $\beta$'s are equal, etc., we have 


ma+2 


(8.9) P{ (ages) - = 0, 


where $\alpha, \beta, \ldots, \gamma$ is any set from $q+1, \ldots, r$ which is not 
a permutation of $q+1, \ldots, r$ and the m's denote powers. 
The substitution 


1 


a B 
= Dest, (Gers) (ar) 


= 5, 


(8.10) 
gives 
a ,8 
Hence the theorem of §6 can be applied to show that the matrix 


and consequently the matrix $\| a^i_j \|$ 
$(i, j = q+1, \ldots, r),$ 


is monomial. 
If we write 


(8.11)    $S = S' + S_{q+1} + \cdots + S_r,$ 


the system $S'$ is left invariant by any transformation (2.3) preserving (8.4), 
(8.5), (8.6); and the systems 


(8.12)    $S' + S_{q+1},\ S' + S_{q+2},\ \ldots,\ S' + S_r,$ 


each of which contains $q + 1$ equations, are permuted among themselves by 
such a transformation. The class values for the canonical form of systems 
(8.12) are 


(8.13)    $q + 2m_{q+1} + 1,\ q + 2m_{q+2} + 1,\ \ldots,\ q + 2m_r + 1,$ 


and consequently the class values must be these whenever conditions (8.4), 
(8.5), (8.6) are satisfied. Thus we have an additional necessary condition on 
a completely separable system. 

The conditions given above are also sufficient. By Theorem 10 systems 
(8.12) can be written in canonical form. From the result already established 
concerning the derived system of a completely separable system in canonical 
form, the first $q$ equations in each of the systems (8.12) must be equations 
(8.3). Since the class values are the set (8.13), the system is expressed in 
terms of $x^1, \ldots, x^q$ and $2(m_{q+1} + \cdots + m_r) + r - q$ other 
variables. Since the total number of these variables is the same as the degree 
of the $\Omega$ in (8.5), that $\Omega$ is the product of their differentials, 
and its non-vanishing declares the set of variables independent. Thus we have 




THEOREM 11. A pfaffian system is completely separable into $r$ equations of 
class $1, \ldots, 1, 2m_{q+1}+1, \ldots, 2m_r+1$, where no $m$ is zero, if and only 
if the following conditions are satisfied. It must be possible to determine a 
transformation (2.3) which realizes (8.4), (8.5), (8.6). When such a 
transformation (2.3) has been found and applied to the system, the numbers 
(8.13) must be the class values of (8.12). 


The determination of the transformation (2.3) involves finding a particular 
solution of a system of homogeneous algebraic equations in the a's whose 
degree is $m_{q+1} + \cdots + m_r$. It is important to note that the second 
condition in Theorem 11 is either satisfied for all solutions of this system of 
algebraic equations or for none. 

A completely separable system having no integrals admits a complete 
partition. That the converse is not true is evident from (7.3). 


DUKE UNIVERSITY, 
DURHAM, N. C. 


THE COMPLETE EXISTENTIAL THEORY OF THE 
WHITEHEAD-HUNTINGTON SET OF POSTULATES 
FOR THE ALGEBRA OF LOGIC* 


BY 
A. H. DIAMOND 


1. Introduction. Consider any set of postulates, say, for the sake of 
concreteness, $P_1$, $P_2$, $P_3$. Any interpretation of the undefined ideas of 
$P_1$, $P_2$, $P_3$ constitutes a concrete system S which either satisfies or 
does not satisfy some or all of the postulates. The system S will then have 
with respect to $P_1$, $P_2$, $P_3$ one of the $2^3$ characters (± ± ±), where a 
“+” sign in the ith place denotes that postulate $P_i$ is satisfied and a “−” 
sign that $P_i$ is not satisfied. The complete existential theory of the 
postulate-set $P_1$, $P_2$, $P_3$ consists in determining for every one of the 
$2^3$ characters (± ± ±) whether or not there exists a concrete system S 
corresponding to that character.† The object of this paper is to establish the 
complete existential theory of the Whitehead-Huntington set of ten postulates 
for the algebra of logic, expressed in terms of logical addition and logical 
multiplication.‡ This is, perhaps, the most “natural” and most elegant set of 
postulates for the Boole-Schröder algebra of logic. 

The complete existential theory of a set of postulates includes the solution 
of the problem of determining whether or not the postulates of the set (or of 
any of its sub-sets) are completely independent, i.e., whether or not any of the 
postulates of the set (or sub-set) or their denials can be derived from any 
of the other postulates or their denials. Thus the present discussion will show 
whether or not the postulates of the Whitehead-Huntington set (or any of 
its sub-sets) are completely independent. 

The present theory has the following distinctive characteristics. (1) The 
number of postulates involved is 10, and so requires, for the establishment 
of the complete existential theory, $2^{10}$ propositions of existence and 
non-existence. This number is far greater than the number of propositions which 
constitute any complete existential theory hitherto published.§ (2) The 
concrete systems employed are all in the modular form devised by Professor 
B. A. Bernstein.* This form permits the listing of the large number of systems 
involved with a conciseness and a simplicity not existing in the proof-systems 
employed before the development of the modular theory in question. (3) The 
number of systems constituting the propositions of existence is 325. This 
number is far larger than any previous number of proof-systems employed 
in connection with a set of postulates. Further, the systems relate to 
postulates expressing laws found in many other mathematical theories. These 
systems thus provide a large store of possible proof-systems for many 
important postulate-sets. (4) The systems employed are all algebras of not more 
than three elements. This adds to the value of the systems as a source of 
possible proof-systems to be used in other postulate-sets. 

* Presented to the Society, March 18, 1933; received by the editors March 18, 1933. 

† Professor E. H. Moore first proposed the problem of the complete existential theory of a set 
of postulates. See his Introduction to a Form of General Analysis, New Haven Colloquium, Yale 
University Press, p. 82. 

‡ See E. V. Huntington, these Transactions, vol. 5 (1904), pp. 288-309. 

§ The largest number hitherto published is $2^6 = 64$. See B. A. Bernstein, The complete existential 
theory of Hurwitz's postulates for abelian groups and fields, Bulletin of the American Mathematical 

I begin with the listing of the Whitehead-Huntington postulates. 

2. The Whitehead-Huntington postulates. The Whitehead-Huntington 
set of postulates for the algebra of logic leaves undefined a class K and two 
binary operations +, ×, and consists of the ten propositions following.† 

Ia. a+b is in K whenever a and b are in K. 

Ib. ab is in K whenever a and b are in K. 

IIa. There is an element Z such that a+Z=a for every element a. 

IIb. There is an element U such that aU=a for every element a. 

IIIa. a+b=b+a whenever a, b, a+b, and b+a are in K. 

IIIb. ab=ba whenever a, b, ab, and ba are in K. 

IVa. a+bc=(a+b)(a+c) whenever a, b, c, a+b, a+c, bc, a+bc, and 
(a+b)(a+c) are in K. 

IVb. a(b+c)=ab+ac whenever a, b, c, ab, ac, b+c, a(b+c), and ab+ac 
are in K. 

V. If the elements Z and U in postulates IIa and IIb exist and are unique, 
then for every element a there is an element a' such that a+a'=U and 
aa'=Z. 

VI. There are at least two elements, x and y, in K such that x≠y. 
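
The ten postulates can be tested mechanically on any small finite system. The following Python sketch is one way to do so; the function name `character`, the dictionary encoding of the two operations, and the use of `None` for a result that is not in K are all assumptions of the sketch. A postulate whose hypothesis never holds counts as satisfied (vacuously), and the hypothesis of postulate V is read as "exactly one Z and exactly one U exist".

```python
from itertools import product


def character(K, add, mul):
    """Return the ten-sign character of the system (K, +, x) as a string.

    K is a list of elements; add and mul map pairs (a, b) to a result.
    A missing pair, or a result outside K, is treated as "not in K".
    """
    K = list(K)

    def apply(table, a, b):
        # None stands for "not in K" and propagates through compound terms.
        if a is None or b is None:
            return None
        value = table.get((a, b))
        return value if value in K else None

    def plus(a, b):
        return apply(add, a, b)

    def times(a, b):
        return apply(mul, a, b)

    def agree(lhs, rhs):
        # An equation is only tested when both of its sides are in K.
        return lhs is None or rhs is None or lhs == rhs

    pairs = list(product(K, repeat=2))
    triples = list(product(K, repeat=3))
    zeros = [z for z in K if all(plus(a, z) == a for a in K)]
    units = [u for u in K if all(times(a, u) == a for a in K)]

    signs = [
        all(plus(a, b) is not None for a, b in pairs),                  # Ia
        all(times(a, b) is not None for a, b in pairs),                 # Ib
        bool(zeros),                                                    # IIa
        bool(units),                                                    # IIb
        all(agree(plus(a, b), plus(b, a)) for a, b in pairs),           # IIIa
        all(agree(times(a, b), times(b, a)) for a, b in pairs),         # IIIb
        all(agree(plus(a, times(b, c)), times(plus(a, b), plus(a, c)))
            for a, b, c in triples),                                    # IVa
        all(agree(times(a, plus(b, c)), plus(times(a, b), times(a, c)))
            for a, b, c in triples),                                    # IVb
        (len(zeros) != 1 or len(units) != 1 or
         all(any(plus(a, x) == units[0] and times(a, x) == zeros[0]
                 for x in K) for a in K)),                              # V
        len(K) >= 2,                                                    # VI
    ]
    return "".join("+" if s else "-" for s in signs)


# The two-element Boolean algebra, with + as "or" and x as "and".
boolean_K = [0, 1]
boolean_add = {(a, b): a | b for a in boolean_K for b in boolean_K}
boolean_mul = {(a, b): a & b for a in boolean_K for b in boolean_K}
print(character(boolean_K, boolean_add, boolean_mul))
```

The same function applies to partial operations: leaving a pair out of `add` or `mul` makes the corresponding sum or product fall outside K, which is how the closure postulates Ia and Ib come to be denied.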


Society, vol. 28 (1922), p. 397. $2^5 = 32$ propositions occur in two other papers. See Paul Henle, The 
independence of the postulates of logic, Bulletin of the American Mathematical Society, vol. 38 (1932), 
p. 409. See also J. S. Taylor, Sheffer's set of five postulates for Boolean algebras in terms of the 
operation “rejection” made completely independent, Bulletin of the American Mathematical Society, 
vol. 26 (1920), p. 449. 

* See B. A. Bernstein, Modular representations of finite algebras, Proceedings of the International 
Mathematical Congress, Toronto, 1924, p. 207. See also B. A. Bernstein and Nemo Debely, A practical 
method for the modular representation of finite operations and relations, Bulletin of the American 
Mathematical Society, vol. 38 (1932), p. 110. 

† The original wording of the postulates is retained except that K replaces Huntington's “class”, 
the circles around the operations + and × are omitted, and the original ∧, ∨, and ā are replaced 
by Z, U and a' respectively. 




If these postulates be denoted by $a_1, a_2, \ldots, a_{10}$ and their denials by 
$a_1', a_2', \ldots, a_{10}'$, then the propositions $a_1, a_2, \ldots, a_{10}$, 
$a_1', a_2', \ldots, a_{10}'$ divide the universe of discourse of these 
propositions into $2^{10} = 1024$ compartments represented by the logical 
products $a_1 a_2 \cdots a_{10}$, $a_1 a_2 \cdots a_{10}'$, $\ldots$, 
$a_1' a_2' \cdots a_{10}'$.* Only 325 of these compartments are actually 
represented in the universe; the remaining 699 are empty because of the 
relations of implication subsisting among the propositions 
$a_1, a_2, \ldots, a_{10}$, $a_1', a_2', \ldots, a_{10}'$. 

* See E. V. Huntington, these Transactions, vol. 26 (1924), p. 277. 

3. Propositions of non-existence. Of the $2^{10}$ propositions constituting the 
complete existential theory of our postulates, $2^{10} - 325 = 699$ are 
propositions of non-existence. These non-existence propositions, together with 
reasons establishing them, are given by propositions A-F following. 

A. There exist no systems for the characters (± ± ± ± ± ± ± ± ± −) 
except (± ± ± ± + + + + + −). 

For, if postulate VI is denied by a system S, then K has just one element 
or none. If K has no elements, S satisfies postulates IIIa, IIIb, IVa, IVb, 
and V vacuously. If K has just one element, there are four cases to consider. 
(1) If S satisfies both postulates Ia and Ib, then S satisfies postulates IIIa, 
IIIb, IVa, IVb, and V non-vacuously. (2) If S denies both postulates Ia 
and Ib, then S satisfies postulates IIIa, IIIb, IVa, IVb, and V vacuously. 
(3) If S satisfies postulate Ia and denies postulate Ib, then S satisfies 
postulate IIIa non-vacuously and postulates IIIb, IVa, IVb, and V vacuously. 
(4) If S denies postulate Ia and satisfies postulate Ib, then S satisfies IIIb 
non-vacuously and postulates IIIa, IVa, IVb, and V vacuously. 

Proposition A accounts for 496 characters. 

B. There exist no systems for the characters (− ± + ± + + + + + −). 

For, if postulate VI is denied, postulate IIa will be satisfied if and only 
if K has just one element and S satisfies postulate Ia. 

Proposition B accounts for 4 characters not already accounted for by 
proposition A, namely (− ± + ± + + + + + −). 

C. There exist no systems for the characters (± − ± + + + + + + −). 

For, if postulate VI is denied, postulate IIb will be satisfied if and only if 
K has just one element and S satisfies postulate Ib. 

Proposition C accounts for 3 characters not already accounted for by 
propositions A and B, namely (+ − ± + + + + + + −) and (− − − + + + + + + −). 

D. There exist no systems for the characters (+ + + − + + + + + −) and 
(+ + − + + + + + + −). 

For, if postulate VI is denied, there are two ways in which postulates Ia 
and Ib may both be satisfied. (1) If K has no elements, postulates Ia and Ib 
are satisfied vacuously. (2) If K has just one element, postulates Ia and Ib 
may both be satisfied non-vacuously. In case (1) both postulates IIa and IIb 
are denied. In case (2) both postulates IIa and IIb are satisfied. 

Proposition D accounts for 2 characters not already accounted for by 
propositions A, B, and C, namely (+ + + − + + + + + −) and 
(+ + − + + + + + + −). 

E. There exist no systems for the characters (+ − − − + + + + + −) and 
(− + − − + + + + + −). 

For, if postulate VI is denied, there are two ways in which postulates IIa 
and IIb may both be denied. (1) If K has no elements, postulates IIa and IIb 
are denied non-vacuously and postulates Ia and Ib are satisfied vacuously. 
(2) If K has just one element, postulates IIa and IIb are both denied if and 
only if postulates Ia and Ib are both denied. 

Proposition E accounts for 2 characters not already accounted for by 
propositions A, B, C, and D, namely (+ − − − + + + + + −) and 
(− + − − + + + + + −). 

F. There exist no systems for the characters (± ± + − ± ± ± ± − +), 
(± ± − + ± ± ± ± − +), and (± ± − − ± ± ± ± − +). 

For, if postulate IIa or IIb or both are denied, then postulate V is satisfied 
vacuously. 

Proposition F accounts for 192 characters not already accounted for by 
propositions A, B, C, D, and E, namely (± ± + − ± ± ± ± − +), 
(± ± − + ± ± ± ± − +), and (± ± − − ± ± ± ± − +). 
4. Propositions of existence. The 325 propositions of existence for our 
postulates are given by the tables A and B below. In these tables all the 
systems are arithmetic systems, the elements being the numbers 0, 1, and 2. 
The notations used are those employed by Professor Bernstein in the second 
of the papers cited in the first footnote on page 941, except that, for the sake 
of saving space, certain abbreviations are resorted to. Let $f(a, b)$ be any 
polynomial expression in $a$ and $b$, where $a$ and $b$ are any of the numbers 
$0, 1, \ldots, p-1$. Then $(f(a, b))_p$ will denote the least positive residue 
modulo $p$ obtained from $f(a, b)$ by rejecting multiples of $p$. The operations 
+ and × are to be interpreted as the operations of ordinary arithmetic when 
they occur in the modular expressions, otherwise they are to be interpreted as 
logical addition and logical multiplication. Thus for $a = 1$, $b = 1$, we have 
$(a^2 + ab + b + 2)_3 = 2$. $[a, b; m, n]_p$ will denote a function 
$(f(a, b))_p$ such that $(f(a, b))_p = 0$ or 1 according as the equalities 
$a = m$, $b = n$ do or do not both hold. The results obtained by Professor 
Bernstein enable us to write down the expression for $[a, b; m, n]_p$. Indeed, 
$[a, b; m, n]_p = (1 - \{1 - (a-m)^{p-1}\}\{1 - (b-n)^{p-1}\})_p$. 
Thus $[a, b; 1, 2]_3 = (1 - \{1 - (a-1)^2\}\{1 - (b-2)^2\})_3 = (2a^2b^2 + a^2b + 2ab^2 + ab + 1)_3$. 
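
The modular notation of this paragraph is easy to try out numerically. In the Python sketch below, `residue` and `indicator` are illustrative names, and the closed form used for the bracket relies on Fermat's little theorem for a prime modulus p; that closed form is an assumption of the sketch rather than a formula quoted from the paper.

```python
def residue(value, p):
    """(value)_p: the residue of value modulo p, in the range 0..p-1."""
    return value % p


def indicator(a, b, m, n, p):
    """[a, b; m, n]_p for prime p: 0 when a == m and b == n, otherwise 1.

    For prime p, x**(p-1) is congruent to 1 modulo p unless x is a
    multiple of p, so 1 - (a - m)**(p - 1) is congruent to 1 exactly
    when a == m (both a and m lie in 0..p-1).
    """
    hit_a = 1 - (a - m) ** (p - 1)
    hit_b = 1 - (b - n) ** (p - 1)
    return residue(1 - hit_a * hit_b, p)


# The worked example from the text: a = 1, b = 1, modulus 3.
a, b = 1, 1
print(residue(a**2 + a*b + b + 2, 3))

# Tabulate [a, b; 1, 2]_3 over all pairs of elements 0, 1, 2.
for a in range(3):
    print([indicator(a, b, 1, 2, 3) for b in range(3)])
```

Python's `%` operator already returns a non-negative residue for a positive modulus, so no separate adjustment is needed when the polynomial value is negative.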

For the sake of simplicity, the 325 systems will be divided into two groups, 
A and B, according as K does not have or does have more than one element. 
Table A gives systems for which K does not have more than one element; 
table B gives systems for which K does have more than one element. To 
save space in table B, instead of writing out the elements of K as in table A, 
the subscript $p$ in $(f(a, b))_p$ will indicate that the elements of K are 
$0, 1, \ldots, p-1$. In both tables, the characters will be written without 
parentheses. 

The duality of the postulates with respect to the operations + and × 
makes it possible to reduce the number of systems in table B from 320 to 
172. Every system S has a dual system obtained from S by interchanging 
the definitions of + and ×. The character corresponding to the dual of a 
system S is determined as follows. Consider the character C corresponding 
to S. C consists of “+” and “−” signs arranged in a certain order. The first 
8 signs occur in pairs corresponding to dual postulates. Interchange in C 
the signs in each pair. The character thus obtained is the character 
corresponding to the dual of S. 
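
The rule for the character of a dual system amounts to swapping the two signs inside each of the first four pairs. A minimal Python sketch follows; the function name `dual_character` and the representation of a character as a ten-symbol string of "+" and "-" are assumptions of the sketch.

```python
def dual_character(character):
    """Swap the signs within each of the first four pairs (Ia/Ib, IIa/IIb,
    IIIa/IIIb, IVa/IVb); the signs for V and VI are left unchanged."""
    if len(character) != 10 or set(character) - {"+", "-"}:
        raise ValueError("a character is a string of ten '+' or '-' signs")
    signs = list(character)
    for i in range(0, 8, 2):
        signs[i], signs[i + 1] = signs[i + 1], signs[i]
    return "".join(signs)


example = "+-+-+++++-"
print(example, "->", dual_character(example))
```

Applying the function twice returns the original character, and a character is self-dual exactly when the two signs agree within each of the four pairs.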

TABLE A 


Character 


—---+++++- 


TABLE B 


x Character a+b ab 
Ia Ib Ila IIIa IlIbIValIVb V VI 


(a+1)2+0/[a, b; 1, 1], (ab+b)2+0/[a, 0, 
a+b of system 1 5; 0, 0], 
(b)2+0/[a, 5; 1, a+b of this system 
(b+1)2+0/[a, b; 1, 1], 0/[a, 5; 1, 1] 
a+b of system 1 (ab+1)2+0/[a, b; 0, Os 
ab of system 4 


“ “ 3 «@ “ 


0/[a, b; 1, 1+0/[a, b; 1, 1] 


+++ +4444 


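No.      Character              K       a+b     ab 
(i)      − − − − + + + + + −    0       0/0     0/0 
(ii)     − + − + + + + + + −    0       0/0     0 
(iii)    + − + − + + + + + −    0       0       0/0 
(iv)     + + − − + + + + + −    Null 
(v)      + + + + + + + + + −    0       0       0 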
1------- + 
2------- + 
$3------+ + 
+ 
| §-----+- + 
6-----++ + 


THE WHITEHEAD-HUNTINGTON POSTULATES 


TABLE B (Continued) 


Character 
Tb Ila {Ib IIIa IIIb IVaIVb V 


a 


a+b 


ab 


| 


+++ 


++ 


| 


titi titi 


Li 


++ + +1 
itt i 41+ 


t+ 
+ 
+ 
+ 
+ 
+ 
+ 
+ 
+ 
+ 
+ 
+ 
+ 
+ 
+ 
+ 
+ 
+ 


++ + +441 


ab of system 4 
(0/0)2 

(c+1):+0/[a, b; 2,2]s 
5; 2,2]s 

a+b of system 1 

“ « “ 3 

(6+1)s+0/[a, 0, O]s 
a+b of system 1 


“ 


be “ “ 


1+0/[a, b; 2, 2]s 
(2a-+26),+0/[a, 2, 2], 

ab of system 8 

4 
(a+b)2+0/[a, 5; 0, 
(6+1)2+0/|a, 5; 1, 

ab of system 8 

“ “ “ 4 

“ “ 11 
(ab+a)s+0/[a, b; 1, 1]s 


b; 2,2] 


(a)s+0/[a, b; 2, 

ab of system 13 

“ 
(ab+a)s+0/[a, b; 1, 2]s 
(ab*)s+0/|a, b; 2, O]s 
(a)2+0/ [a, b; 0, 0}: 

of system 30 


ab of system 8 
(0/0)s 
(ab+a)s+0/[a, b; 2,2]s 
(2a*b-+-a+b)s+0/|[a, 6; 2,2]s 

(a)2+0/ [a, b; 1, 1}, 
ab of system 13 

(ab)s+0/ [a, b; 0, O]s 

(a+b)2+0/ {a, 6; 0, 1}; 

(a).+0/(a, b; 1, 0}. 

(a+b).+0/|[a, 5; 1, 
ab of system 11 


(ab), +0/[a, b; 1, 
(ab)2+0/ [2, b; 0, 0): 
ab of system 11 
a+b of this system 
ab of system 12 
“« 11 
a+b “ 3 
(ab)s+0/[a, b; 1, 2]s 
(a+6)s+0/ [a, b; 0, 2]s 
ab of system 16 
(a+b)s+0/[a, b; 2, 


1,1], 5; 1, 1]s 


(ab*);+-0/[a, b; 0, 
ab of system 13 
a+b “ 3 
ab “ “ 16 
39 
5; 


ab of system 39 
“ 18 


a+b “ 
« 
ab “ 
a+b 
“ “ 
« 


(2a*)+0/[a, b; 0, 


(a,b; 2,2], 


(a)2+0/[a, b; 0, 1] 
(2a%+a+b);+0/[a, 5; 1, 2]s 
ab of system 18 
«@ « 16 
(a*b+ab?+a+b)s 
+0/[a, 6; 1, 1]s 


ab of system 39 
26 
(ab+-b)s 
(6+1)s 
(a+1): 
(b)s 
(0)s 
(ab+1)s 
(a*-+5*)s 


1933] 945 
= 
Ia 
10- 
— 
12- 
— — 
1s — — — 
16 — 
148 —- — — 
— — 
oo 16 
18 
26 — — 
27 — — ----- 
2 — --- 
29 --- q 
30 — — --- 
31 — --+ 
33 — -+- 
a 
35 — | 
q 
37 | 
3 
39 — — ‘| 
a-- q 
42 — — 
43 — — +- 
44 +- 
46 — + + 
479-+----- 
o-+----+ | 
so-+----+ a 
s2-+—---+- 
3-+---++ 
| 




TABLE B (Continued) 


Character a+b 
Ila Ilb Ila llIbIVaIVb V 


- 


a+b of system 
ab be “ 
“ 


3 ab of system 51 
4 @ 
“ 
5 
4 
8 
4 


48 


a+b 


ab ab of system 59 
(a+b+1)2+0/[a, 5; 0, 1], (ab+a+b+1)2 
ab of system 10 ab of system 51 
(b+1)2+0/[a, 5; 0, (ab+6+1)s 
a+b of system 63 (ab+<a)2 
ab of system 65 
(ab+a+5)2 
(a+b): 
(ab)2 
ab of system 68 


1 


| | 

Bw = 


(2a*b+a-+-b)s 
ab of system 65 


68 

“ 

69 

(a+ 1)3 

ab of system 49 
(2a*+b+1)s 

ab of system 50 

(1)s 
ab of system 59 
(2a+2b)s 
(a)s+0/[a, b; 0, O]s ab of system 85 
(a+b+1).+0/[a, 5; 1, «47 
of system 17 


“ 


ab of system 95 
“« “ 64 
&@ “ 95 

(2a*b?-+-ab+-a)s 


+ 
+ 
+ 
+ 
+ 
+ 
+ 
+ 
+ 
+ 
+ 
+ 
+ 
+ 


44441 


e & 2 


i + 


946 [October 
Ta 
55 — 
56 — 
57 — 
58 — 
59 — 
— a= 
61 — 
62 — 
om 
6 — 
= 
67 — 
68 — 
69 
70 — 
71 — 
72 — 
75 — 
76 — 
77 
78 — 
79 — 
80 — 
81 — 
82 — 
83 — 
85 — 
86 — 
87 — 
88 — 
89 “ “ 16 “« «@ “ 
“ 16 “ “« 52 
92 — “ 25 « 59 
93 “« 18 “ “ 61 
94 — « 26 51 
95 — 11 (ab+-a)s 
9 — a+b 28 
97 — ab 13 
98 — a+b 30 
99 — ab 12 
} 




TABLE B (Continued)


Character a+b 
IIa Ifb IIIa IIb IVaIVb V VI 


of system 11 (a)s 
ab of system 72 
“« “ 65 
(ab)s 
(a+b)3 
ab of system 68 
“« “ 104 
(ab+2a+2b+-2)3 
ab of system 69 
(2a*b?+-2a*b+-2ab?+- ab); 
ab of system 63 
64 
« 
(ab*)s 
ab of system 63 
“« 65 
& « 63 
65 
(a+b+ 1)2 
ab of system 68 


&@ “ “ 


+ 
| 


& 


L+++4+1 


| 


“ 
“ 
“ 
“ 
“ 
“ 
“ 
“ 
“ 
“ 


2 


be 


* (a*b+ab?-+a+b); 
(ab); +0/[a, b; 2, 2], ab of system 104 
of system 18 
69 
78 
49 


(2a*)s 
of system 131 


4 
3 


+ 
+ 
+ 
+ 
+ 
+ 
+ 
+ 
+ 
+ 
+ 
+ 
+ 
+ 
+ 
+ 
+ 
+ 
+ 
+ 
+ 
+ 
+ 
+ 
+ 
+ 
+ 
+ 
+ 
+ 
+ 
+ 
+ 
+ 


FRR RRR RR RE, 

® 

FRR RRR Re 


++++4+4+4441 


1933] eee 947 
Ia 

100 — _ ab 

101 — 

102 — 

103 — b 

104 — 

105 — 

106 — b 

107 — 

108 — 

109 — 

110 — b ; 

— 

112 — 

113 

114 — i 

115 — 

1146 — 

117 

118 — 

119 — 

120 — 

121 a 39 

122 — 

123 — if 

124 — 

125 — 

126 

127 

129 50 

130 51 

131 

132 53 

133 51 | 

134 59 | 

135 131 

136 51 

137 64 

138 72 it 

139 65 4 

140 “ 

141 69 | 

142 68 i 

143 69 | 

144 

145 64 | 




TABLE B (Continued) 


Character a+b ab 
Ib Ila IIb IIIb IVa IVb V 


of system 85 ab of system 72 

“« 100 
“« 69 
(2ab+a+5)s 
ab of system 
“« «@ 


hi 
++++4+4+4 


j++ 


“ 
“ 
“ 
“ 
“ 
“ 
“ 
“ 


“ 


(2a*b+-ab-+-2a?+ 2a); 
of system 114 


“ 


103 

“ 108 

(a*b?+ 20*b+ 2ab?+-2ab)s3 
ab of system 69 


& 


| 


titi ti 


68 
103 
69 


+ 
+ 
+ 
+ 


+++4+441 
+++ 


5. Remarks on the systems. In the proof-systems denying postulate Ia, the elements $a$ and $b$ for which $a+b$ does not exist are respectively the values of $m$ and $n$ which occur in the symbol $[a, b; m, n]_p$. Similarly for postulate Ib.

The verifications will often be more easily effected if tables are constructed 
corresponding to the modular expressions in question. 
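Such a table can be built as in the following minimal sketch. It assumes, following the remark above, that a term $0/[a, b; m, n]_p$ marks the single pair $(m, n)$ at which the operation does not exist; the helper `operation_table` and the use of `None` for a missing value are illustrative choices, and the sample expression only has the shape of the entries of table B.

```python
def operation_table(expr, p, undefined=()):
    """Rows a = 0..p-1, columns b = 0..p-1 of (expr(a, b))_p; None where undefined."""
    return [
        [None if (a, b) in undefined else expr(a, b) % p for b in range(p)]
        for a in range(p)
    ]


# An expression of the shape (ab + a)_3 + 0/[a, b; 1, 1]_3:
# defined for every pair except a = 1, b = 1.
table = operation_table(lambda a, b: a * b + a, 3, undefined={(1, 1)})
for row in table:
    print(row)
```

With the table in hand, closure, commutativity and the remaining postulates can be verified by direct inspection of rows and columns.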


UNIVERSITY OF CALIFORNIA, 
BERKELEY, CALIF. 


948 
\ 
146 = 
147 
148 
149 
150 
151 
152 
153 64 
154 a 
155 99 
156 65 
157 72 
158 65 
159 63 
160 64 
161 
162 65 
163 
164 
165 
166 oe 6 65 
167 “ “ “ 119 eee 
168 “ “ “ 68 
170 « 6 
171 “ “ “ 69 “« «@ “ a 
172 “ “ “ 67 “« « “ “ 


CYCLIC FIELDS OF DEGREE EIGHT* 


BY 
A. ADRIAN ALBERT 


1. Introduction. Let $F$ be any non-modular field and $C$ an algebraic extension of degree $n$ of $F$. Then $C=F(x)$ is the field of all rational functions with coefficients in $F$ of a root $x$ of an equation $\phi(\omega)=0$ which has coefficients in $F$, degree $n$, and transitive group $G$ for $F$.

The problem of the construction of all equations of degree $n$ and group $G$ is evidently equivalent to the problem of the construction of all corresponding fields $C$. Moreover the construction of a set of canonical equations $\psi(\omega)=0$ with the property that every $C=F(x)$ of degree $n$ and group $G$ is equal to an $F(y)$ defined by a $\psi(\omega)=0$ provides a solution of both problems.

One of the most important problems in the algebraic theory of fields is the construction of all cyclic fields of degree $n$ over $F$. This is the case where $G$ consists of the $n$ distinct powers $S^i$ $(i=0, 1, \ldots, n-1)$ of a single substitution $S$. In this case $G$ is also the group of all automorphisms of $C$. Moreover this problem has been reduced to the case $n=p^e$, $p$ a prime.†

Cyclic fields of degree $2$, $2^2$ have been constructed. In the present paper we shall use purely algebraic methods to construct all cyclic fields of degree $2^3=8$‡ over any non-modular field $F$.

2. General theory of cyclic fields. Let $F$ be any non-modular field and let $C$ be a cyclic field of degree $n$ over $F$. Then if

(1) $n = p_1^{e_1} \cdots p_t^{e_t}$,

where the $p_i$ are distinct primes, it is well known that $C$ is the direct product

(2) $C = C^{(1)} \times \cdots \times C^{(t)}$

of cyclic fields $C^{(i)}$ of degree $p_i^{e_i}$ over $F$. Conversely every direct product (2) is a cyclic field of degree $n$ over $F$. It is thus certain that the problem of constructing all cyclic fields of degree $n$ over $F$ is equivalent to the corresponding problem for the case $n=p^e$.


* Presented to the Society, February 25, 1933; received by the editors December 19, 1932. 

† Cf. §2.

‡ Cyclic fields of degree eight have been considered by F. Mertens in the Wiener Sitzungsberichte, vol. 125 (1916), pp. 741-831. But he considered algebraic number fields, used the arithmetic theory of ideals, and did not give very explicit results. His method is not at all applicable to the case we are considering (where $F$ is a general field). Moreover I believe the results obtained here are more explicit and give a more definite construction for $C$ even for the cases considered by Mertens.


949 




950 A. A. ALBERT [October 


Let then $C=C_e$ have degree $n=p^e$, $p$ a prime. It is well known that we may define a chain of fields

(3) $C_e > C_{e-1} > \cdots > C_1 > C_0 = F$,

where $C_i$ is cyclic of degree $p^i$ over $F$, cyclic of degree $p$ over $C_{i-1}$. In fact let $S$ be the automorphism of $C$ generating its group $G$ of automorphisms. Then this group of order $n$ is given by

(4) $G = (I, S, S^2, \ldots, S^{n-1})$, $S^n = I$.

But if $T=S^m$, $m=p^{e-1}$, then

(5) $H = (I, T, T^2, \ldots, T^{p-1})$

is an invariant sub-group of $G$ of index $p^{e-1}$ defining a sub-field $C_{e-1}$ of degree $p^{e-1}$ and with Galois group

(6) $G_{e-1} = (I, \sigma, \sigma^2, \ldots, \sigma^{m-1})$, $m = p^{e-1}$,

isomorphic with $G/H$, so that $T=S^m$ in $G$ corresponds to the identity of $G_{e-1}$.

We may now consider every $C_e$ as a cyclic field of degree $p$ over a cyclic field $C_{e-1}$ of degree $p^{e-1}$ to obtain some of the properties of $C_e$. But if $C_e$ is cyclic of degree $p$ over $C_{e-1}$ which is cyclic of degree $p^{e-1}$ over $F$, then it is not necessarily true that $C_e$ is cyclic over $F$. Thus we shall also require a consideration of further properties.

We are interested here only in the case $p=2$. Let $C$ be a cyclic field of degree $n=2^e$, $e\geq 1$, over $F$, and let $D$ be its uniquely defined sub-field of degree $m=2^{e-1}$. Then $C$ is a quadratic field over $D$,

(7) $C = D(x)$, $x^2 = a$ in $D$,

where $1$, $x$ are linearly independent with respect to $D$. The substitution $S$ generating the cyclic group of $C$ has order $n$ and $D$ consists of all quantities $d$ of $C$ such that

(8) $d^{S^m} = d$ $(m = 2^{e-1})$.

For convenience of notation we shall write

(9) $c^{(i)} = c^{S^i}$ $(c' = c^{(1)})$,

so that $c^{(m)}=c$ whenever $c$ is in $D$ but not otherwise. Then

$[x^{(m)}]^2 = [x^2]^{(m)} = a^{(m)} = a = x^2$,

and $x^{(m)}=\pm x$. But $x$ is not in $D$. Hence

(10) $x^{(m)} = x^{S^m} = -x$.




1933] CYCLIC FIELDS OF DEGREE EIGHT 951 


In particular let $x'=\alpha+\beta x$ where $\alpha$ and $\beta$ are in $D$. Then $(x')^2=a'=\alpha^2+\beta^2a+2\alpha\beta x$. But $a'$ is in $D$ so that $2\alpha\beta=0$. If $\beta=0$ then $x'=\alpha$ is in $D$ and $(x')^{(m-1)}=x^{(m)}=\alpha^{(m-1)}$ is in $D$, a contradiction. Hence $\beta\neq 0$, $\alpha=0$ and

(11) $x' = \beta x$, $\beta$ in $D$.

It is obvious that $D(x)=D(bx)$ for every non-zero $b$ of $D$. Hence all of the above properties as well as those we may derive later will hold for any $bx$ taken as the quantity generating $C$, a quadratic field over $D$.

We shall assume first that $n=2$. Then $D=F$ and, since $x$ is not in $D$, the field $F(x)=D(x)=C$ is a quadratic field over $F$ generated by $x$. Let next

$m = 2^{e-1} = 2g$, $g \geq 1$,

so that $n\geq 4$. Then $D=K(y)$, $y^2=\alpha$ in $K$, is a quadratic field over the field $K$ of all quantities $k$ of $C$ such that

$k^{(g)} = k$.

The field $F(x)$ is a sub-field of $C=D(x)$. But $x$ is not in $D\geq F(x^2)=F(a)$ so that the degree of $F(x)$ is $2h$ where $h$ is the degree of $F(x^2)$. Hence $F(x)=C$ if and only if $F(a)=D$.

Suppose that $F(a)<D$. Then $a$ is in a proper sub-field of $D$. But $D$ is cyclic and its maximal proper sub-field $K$ contains every proper sub-field of $D$. Hence $a$ is in $K$, $a^{(g)}=a$, $[x^{(g)}]^2=a^{(g)}=a=x^2$. Then $x^{(g)}=\pm x$, $x^{(m)}=[x^{(g)}]^{(g)}=x$, a contradiction. Hence $F(a)=D$ and we have

THEOREM 1. Let $C$ be a cyclic field of degree $n=2m$ over $F$, $C=D(x)$ where $D$ is a cyclic sub-field of $C$ of degree $m=2^{e-1}$ over $F$ so that $x$ may be chosen so that

$x^2 = a$ in $D$.

Then $x'=\beta\cdot x$ where $\beta$ is in $D$ and has the property that $x^{(m)}=-x$. Moreover this latter property implies that $F(x)=C$, $F(a)=D$.

Suppose that $x_0=bx$, $b\neq 0$ in $D$. Then $x_0^2=b^2x^2=b^2a$ is in $D$, $x_0^{(m)}=-bx=-x_0$. By Theorem 1, $C=F(x)=D(x)=D(x_0)=F(x_0)$.

THEOREM 2. Let $x_0=bx$ where $b\neq 0$ is in $D$. Then $F(x)=F(x_0)=C$.


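The condition $x'=\beta\cdot x$ imposes two restrictions on $\beta$. The first is obviously $(x')^2=\beta^2x^2=(x^2)'=a'=\beta^2\cdot a$, a necessary and sufficient condition that $x'$ shall actually equal $\beta\cdot x$. Next we must have $x^{(m)}=-x$. But

$x'' = (\beta x)' = \beta'\beta x, \ldots, x^{(i)} = [\beta^{(i-1)} \cdots \beta'\beta]\cdot x$,

and

$x^{(m)} = [\beta^{(m-1)} \cdots \beta'\beta]x = -x$,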




so if we write

$N_D(\beta) = \beta\beta' \cdots \beta^{(m-1)}$

then it follows from $x^{(m)}=-x$ that

(12) $N_D(\beta) = -1$.


Conversely let $D$ be cyclic of degree $m=2^{e-1}$ over $F$ and let $a\neq 0$, $\beta$ in $D$ satisfy $a'=\beta^2a$ and (12). Then the field $D(x)$ defined by a root $x$ of $x^2=a$ is a quadratic field over $D$ if and only if $a$ is not the square of any quantity of $D$. But if $a=c^2$, $c$ in $D$, then $\beta^2=(a')(a)^{-1}=(c'c^{-1})^2$ so that

(13) $\beta = \pm (c')(c)^{-1}$.

But $m$ is even and

(14) $N_D(\beta) = (\pm 1)^m N_D(c')[N_D(c)]^{-1} = (\pm 1)^m = 1$,

a contradiction of (12). Hence $D(x)$ has degree $n=2m$. Also if we define $x'=\beta\cdot x$ then (12) implies that $x^{(m)}=-x$ so that we have defined a self correspondence of $C=D(x)$

(15) $c + dx \leftrightarrow c' + d'\beta x$,

for every $c$ and $d$ of $D$, $c+dx$ of $C$. This correspondence is evidently preserved under addition, subtraction, multiplication and division and is an automorphism of $C$ if and only if $(x')^2=a'$, which is satisfied since $a'=\beta^2a$. Hence (15) is an automorphism $S$ of $C$ and, since $S^m$ is an automorphism of $C$ in which $x$ corresponds to $-x$, the order of $S$ is $n=2m$ and $C$ is a cyclic field. Obviously $D$ is the set of all quantities of $C$ unaltered by $S^m$. By Theorem 1, $C=F(x)$ and we have proved


THEOREM 3. Let $D$ be cyclic of degree $m=2^{e-1}$ over $F$ with generating automorphism

$d \leftrightarrow d'$,

for every $d$ of $D$. Then $D$ is the unique sub-field of degree $m$ of a cyclic field $C$ of degree $n=2m$ if and only if there exist quantities $\beta\neq 0$, $a\neq 0$ in $D$, such that

(16) $\beta^2a = a'$, $N_D(\beta) = -1$.

Moreover every solution of (16) defines a cyclic field of degree $n$ over $F$

(17) $C = F(x)$, $x^2 = a$ in $D$, $x' = \beta\cdot x$,

with (15) as generating automorphism, so that $D$ is the set of all quantities $d$ of $C$ such that $d^{(m)}=d$.
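A small numerical check may make condition (16) concrete. The sketch below is an illustration only: it takes $F$ to be the rational numbers and $D=F(\sqrt{2})$ (so $m=2$), stores $p+q\sqrt{2}$ as a pair of fractions, and uses helper names (`mul`, `conj`, `norm`, `inv`, `satisfies_16`) that are not in the paper. The pair $a=2+\sqrt{2}$, $\beta=\sqrt{2}-1$ satisfies (16), while a square $a=c^2$ forces $N_D(\beta)=+1$ as in (13) and (14).

```python
from fractions import Fraction as Fr

# Elements p + q*sqrt(2) of D = Q(sqrt 2) are stored as pairs (p, q).
def mul(u, v):
    return (u[0] * v[0] + 2 * u[1] * v[1], u[0] * v[1] + u[1] * v[0])

def conj(u):
    """Generating automorphism d -> d' of D over Q (sqrt 2 -> -sqrt 2)."""
    return (u[0], -u[1])

def norm(u):
    """N_D(u) = u * u' for the quadratic field D (m = 2)."""
    return mul(u, conj(u))[0]

def inv(u):
    n, c = norm(u), conj(u)
    return (c[0] / n, c[1] / n)

def satisfies_16(a, beta):
    """Condition (16): beta^2 * a = a' and N_D(beta) = -1."""
    return mul(mul(beta, beta), a) == conj(a) and norm(beta) == -1

a = (Fr(2), Fr(1))             # a = 2 + sqrt 2
beta = (Fr(-1), Fr(1))         # beta = sqrt 2 - 1
ok = satisfies_16(a, beta)     # both parts of (16) hold

c = (Fr(3), Fr(1))             # a square a = c^2 cannot work:
a_sq = mul(c, c)
beta_sq = mul(conj(c), inv(c)) # beta = c'/c as in (13)
```

Here $x^2=2+\sqrt{2}$ gives the familiar cyclic quartic field over the rationals, and `norm(beta_sq)` equals $1$, in agreement with (14).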




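The case $m=1$, $n=2$ is trivial so that we shall assume henceforth that $n>2$, $n=4g=2m$. Then (16) implies that

(18) $(\beta^{(i)})^2 = a^{(i+1)}[a^{(i)}]^{-1}$ $(i = 0, 1, \ldots, g-1)$,

and hence that

(19) $[\beta\beta' \cdots \beta^{(g-1)}]^2 = a^{(g)}a^{-1}$.

Then if

(20) $y = a\beta\beta' \cdots \beta^{(g-1)}$,

equation (19) implies

(21) $y^2 = aa^{(g)}$.

But $D$ is a quadratic field $K(d)$, $d^2$ in $K$, over a cyclic field $K$ of degree $g$ over $F$. Moreover

(22) $k^{(g)} = k$

for every $k$ of $K$. Since $m=2g$, $a^{(m)}=a$, we have $[aa^{(g)}]^{(g)}=aa^{(g)}$ so that $y^2$ is in $K$. Also $yy^{(g)}=a[\beta\beta' \cdots \beta^{(g-1)}]\,a^{(g)}[\beta^{(g)} \cdots \beta^{(m-1)}]=aa^{(g)}N_D(\beta)=-aa^{(g)}=-y^2$, $y^{(g)}=-y$, and $y$ is not in $K$. But then $y$ generates $D$, a quadratic field over $K$, and

(23) $D = K(y)$, $y^2 = \alpha$ in $K$.

The field $D=K(y)$ is a quadratic field over $K$ which is cyclic of degree $g$ over $F$. By Theorem 3 there exist quantities $\alpha=y^2$ in $K$, $\gamma=y'y^{-1}$ in $K$, such that

(24) $N_K(\gamma) = \gamma\gamma' \cdots \gamma^{(g-1)} = -1$, $\gamma^2 = \alpha'\alpha^{-1}$,

and, by this same theorem, $D=F(y)$. Hence

(25) $C = F(x)$, $x^2 = a$ in $D$, $D = F(y)$, $y^2 = \alpha$ in $K$,

(26) $x' = \beta x$, $y' = \gamma y$, $a = y[\beta\beta' \cdots \beta^{(g-1)}]^{-1}$.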


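We now wish

(27) $\dfrac{a'}{a} = \dfrac{y'}{y}\cdot\dfrac{\beta\beta' \cdots \beta^{(g-1)}}{\beta'\beta'' \cdots \beta^{(g)}} = \gamma\,\dfrac{\beta}{\beta^{(g)}} = \beta^2$.

But (27) is equivalent to

(28) $\gamma\beta = \beta^2\beta^{(g)}$,

that is, $\gamma=\beta\beta^{(g)}$.

Conversely, let $y^2=\alpha$ in $K$, $\gamma^2=\alpha'\alpha^{-1}$, $N_K(\gamma)=-1$, so that, by Theorem 3, $K(y)=F(y)$ is a cyclic field of degree $2g$ over $F$. Let also $a$ be defined by the third equation of (26), and $\beta$ be in $D$ and satisfy

(29) $\gamma = \beta\beta^{(g)}$.

Then

$N_D(\beta) = \beta\beta' \cdots \beta^{(m-1)} = [\beta\beta^{(g)}][\beta'\beta^{(g+1)}] \cdots [\beta^{(g-1)}\beta^{(m-1)}] = \gamma\gamma' \cdots \gamma^{(g-1)} = N_K(\gamma) = -1$.

Also

$\dfrac{a'}{a} = \dfrac{y'}{y}\cdot\dfrac{\beta}{\beta^{(g)}} = \gamma\,\dfrac{\beta^2}{\beta\beta^{(g)}} = \beta^2$,

as desired. We are now in a position to prove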


THEOREM 4. Let $n=4g=2^e$ and let $K$ be a cyclic field of degree $g$ over $F$ with automorphism $k \leftrightarrow k'$ for every $k$ of $K$. Then $K$ is the unique sub-field of degree $g$ of a cyclic field $C$ of degree $n$ over $F$ if and only if there exist quantities $\alpha\neq 0$, $\gamma\neq 0$, $\beta_1$, $\beta_2$ in $K$ satisfying

(30) $\alpha' = \gamma^2\alpha$, $N_K(\gamma) = -1$, $\gamma = \beta_1^2 - \beta_2^2\alpha$.


Every solution of (30) defines a cyclic field F(x) =C with 


(31) 


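and with generating automorphism $S$ given by

(32) $c = c_1 + c_2y + (c_3 + c_4y)x \leftrightarrow c^S = c_1' + c_2'\gamma y + (c_3' + c_4'\gamma y)\beta x$,

for every $c_1$, $c_2$, $c_3$, $c_4$ of $K$, $c$ of $C$, so that

(33) $x^S = \beta x$, $y^S = \gamma y$.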


We obviously also will have 




COROLLARY 1. In Theorem 4 the field $K$ is the field of all quantities of $C$ unaltered by $S^g$, the field $D=F(y)=K(y)$ is the field of all quantities of $C$ unaltered by $S^m$.

For we need only notice in the above that, since $\beta$ is in $D$, $\beta=\beta_1+\beta_2y$ where $\beta_1$ and $\beta_2$ are in $K$. We have then merely replaced the condition $\gamma=\beta\beta^{(g)}$ by the equivalent condition $\gamma=\beta_1^2-\beta_2^2\alpha$ of (30).

We shall now obtain some important restrictions which it is possible to impose on $\beta$. Suppose first that $n=4$, $m=2$. Then $K=F$, $N_K(\gamma)=\gamma=-1$ is in $F$, and (30) becomes merely $\beta_1^2-\beta_2^2\alpha=-1$. If $\beta_1=0$ then $\alpha=(\beta_2^{-1})^2$, which is impossible if $D$ is a quadratic field over $F$. Hence for this case $\beta_1\neq 0$.

There exists the possibility in the above theorem that $\beta_1\beta_2=0$. We shall be able to restrict $\beta$ so that all fields $C$ are obtained yet $\beta_1\beta_2\neq 0$.

By Theorem 2 if dey, 6; and in K, then = bx also gen- 
erates F(x) and satisfies 
(34) xo = do = = » = Boxo, = ao, = Yoyo; 


BoBo eee 


with 


(35) = Nx(vo) = — 1, Bo = Bio + Bio — = Yo. 


0 
But 
(36) = (bx)! = b/Bx = = Bobx, 


lid by + bi vy 
= = = (b/ + 
Bo B = (bi 2 — bey)e 


37 
= — bg + ybi — by be) y]e, 
where 

(38) e = (b? — 


is either in K or a multiple of y by a quantity of K according as 8.=0 or 
6: =0. But then 610820 =0 if and only if 
(39) yb: — by be = O, or bf bi — bf = 0. 
Suppose first that =0. Then since b,5.~0, 
bib, bg 


and 




bi be by de 


41) —1= Nx(y) = 


since 6: =), and are in K, a contradiction. Hence 
We have then proved that if then =b.bo’ay. If n=4 then 
m=2 and we have already shown that 6,~0. Hence 8i0+0 and hence 
B2=B2o=0. But the coefficient of y in By when e+0 is in F=K, that is, 
Be=0, is Boo=d(be"yb; —b,'b,) #0 as we have shown. It remains only to con- 
sider the case n>4, m> 2, =aybob.’. 
Let Then F(yo) =F(y), 


(42) yoys = baby (bib{ = 1, yo = (yo), ye’ = yo. 
But the automorphism S of D =F (yo) replacing yo by yo’ has order m. Hence 
m=2, a contradiction. We have proved 


THEOREM 5. Every cyclic field $F(x)$ of degree $n=2^e$ over $F$ with $K$ as cyclic sub-field is generated by an $x$ of Theorem 4 with $\beta_1\beta_2\neq 0$ in (30).


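3. Cyclic quartic fields. Let $n=4$ so that $K=F$, $g=1$, $\gamma$ and $\alpha$ are in $F$. Then $N_K(\gamma)=\gamma=-1$, $\beta_1^2-\beta_2^2\alpha=-1$ for $\beta_1$, $\beta_2$ in $F$. Put $\epsilon=\beta_1^{-1}$ and obtain $-\epsilon^2=1-(\beta_2\beta_1^{-1})^2\alpha$, whence if $u=\beta_2\epsilon y$ then $F(u)=F(y)$ and

(43) $u^2 = 1+\epsilon^2 = \tau$ in $F$, $\beta = \dfrac{1+u}{\epsilon}$,

since $\epsilon\beta=\epsilon\beta_1+\epsilon\beta_2y=1+u$. Also

(44) $x^2 = a = \dfrac{y}{\beta} = \dfrac{\epsilon y}{1+u} = \dfrac{u(1-u)}{\beta_2(1+u)(1-u)} = \nu(u-\tau)$,

where $\nu=(-\beta_2\epsilon^2)^{-1}\neq 0$ is in $F$. We have therefore proved the well known result

THEOREM 6. Every cyclic field $F(x)$ of degree four over $F$ is generated by a quantity $x$ satisfying

(45) $x^2 = \nu(u-\tau)$, $u^2 = \tau = 1+\epsilon^2$, $x' = \dfrac{1+u}{\epsilon}\,x$,

where $\epsilon$ and $\nu\neq 0$ are in $F$ and $\tau$ is not the square of any quantity of $F$.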
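The quartic construction can be checked with exact rational arithmetic. The following sketch is an illustration under stated assumptions: $F$ is taken to be the rationals, elements $p+qu$ of $F(u)$ with $u^2=\tau$ are stored as pairs, the automorphism of $F(u)$ is $u\to -u$, and the names `quartic_data` and `check_quartic` are hypothetical. It verifies that $a=\nu(u-\tau)$ and $\beta=(1+u)/\epsilon$ satisfy $\beta^2a=a'$ and $\beta\beta'=-1$.

```python
from fractions import Fraction as Fr

def mul(x, y, tau):
    """Product of p + q*u and r + s*u in F(u), where u^2 = tau."""
    return (x[0] * y[0] + tau * x[1] * y[1], x[0] * y[1] + x[1] * y[0])

def quartic_data(eps, nu):
    """tau, a, beta built from the parameters eps, nu of the quartic theorem."""
    tau = 1 + eps * eps
    a = (-nu * tau, nu)          # a = nu * (u - tau)
    beta = (1 / eps, 1 / eps)    # beta = (1 + u) / eps
    return tau, a, beta

def check_quartic(eps, nu):
    tau, a, beta = quartic_data(eps, nu)
    a_conj = (a[0], -a[1])            # image of a under u -> -u
    beta_conj = (beta[0], -beta[1])
    first = mul(mul(beta, beta, tau), a, tau) == a_conj   # beta^2 * a = a'
    second = mul(beta, beta_conj, tau) == (-1, 0)         # norm of beta is -1
    return first and second

# eps = 1, nu = -1 gives tau = 2 and x^2 = 2 - sqrt 2.
results = [check_quartic(Fr(e), Fr(n)) for e in (1, 2, 3) for n in (-1, 1, 5)]
```

The identities hold for every non-zero $\epsilon$ and $\nu$; the requirement that $\tau$ be a non-square is what makes $F(u)$ a genuine quadratic field and is not tested here.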


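4. Cyclic fields of degree eight. Let now $n=8$, $m=4$, $g=2$. Then $F(y)$ is a cyclic quartic field. We wish $\beta=\beta_1+\beta_2y$ with $y^2=\alpha$ in $K$, $\gamma$, $\beta_1$, $\beta_2$ in $K$ and $\gamma\gamma'=-1$,

(46) $\beta_1^2 - \beta_2^2\alpha = \gamma$, $\beta_1\beta_2 \neq 0$.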


[October 
— 

i] 


CYCLIC FIELDS OF DEGREE EIGHT 


(47) 5 = By = Bea + Biy = 1 + Yo, 
where 

(48) F(yo) = F(y), yo = B1y, 61 = Box in K. 
Then 

(49) B = yd = (Biye")6, (Bex)? — Bea = — ay, 
so that, since 

(50) = a = Bra, yo = Yoyo, 

we have 

(51) B = = + 5190). 
Also 68’B’’B’’’ = —1 and hence 


66’ ” 
(52) a= = — ) 
8B" yy’ 


ay 


(68’)”” Bi 
Yo, 


By (8:8; ) 


— aoYo 


where 
ag 
(53) = — 1, = —> 62 — ap = — A = BiB! . 
ao 
Suppose that is in F. Then yoyo’ = —1 gives yo? = —1, i =(—1)"?. 
Also ao’ = —ao and if K =F(u) we may take a)=u. Then the solution of (46) 
is equivalent to 6;7—ao= —A~'avyo where J is in F and hence to the solution 
of 6? =u(1 But if this implies that in PF, 
+8:27 =0 and r= a contra- 
diction of our hypothesis that F(u) is a quadratic field. 
Hence 7 is not in F and the hypothesis 8. >£0 of §3 is satisfied for F(yo). 
But then 
yo 1 + 


(54) ye = ao = — 7), ,W=r=1+e, 
Yo € 


Also =ve—u(r —1); that is, since r—1 =’, 


(55) — = ven. 


1933] 957 
Let 
| 
| 


958 A. A. ALBERT 


We may now complete our computation (52) of a. We use 
= [(51 + + yd) = (81 — yo) (5% — Yoyo) Yo 
= (— ao + 5190) (61 — Yoyo) = — + + (615% + avo) yo. 
Hence 
Bi u 
(56) = [veu5{ — v(u — + — ven) yo], 


where 
(57) 5: = + = — Bi = Es + 


Also (51) gives 8 =Bi(veu)—! (vew— and 
hence 
58) = & E + 


veT 


We have proved 
THEOREM 7. Every cyclic field F(x) of degree eight over F is generated by a 
quantity x satisfying 
(59) x? = a, x’ = Bx, 
with a and B given by (54), (56), (57), (58) such that v0 in F, 6:0, 8,0, 
and if 
(60) d= — 
then 
(61) 6 = ap — A“agyo = v(u — + 
The quantity 6,? = £:?+ £?r+2£,£u, so that (61) is equivalent to 
(62) vr = £2 + = v(1 + 
But then —2é,f7 = (—vr)(1+A~'e) = (14+A—"e) + &?7), so that, since 
v0, equation (61) is equivalent to 
— 
+ 
The first equation of (63) will be taken to determine v. The second equa- 
tion becomes 


(63) y= (— 7) + #0, = 


+ t?r 


+ r + 
+ &?r 


|-@ — §?7) 


(64) 


[October 
\ 


1933] CYCLIC FIELDS OF DEGREE EIGHT 959 


to be solved for §£?+é&?7+0. But £2? = £7)? + £2 7(1 —7) 
= (£:+ £7)? Hence if 
+ &4u)(E1 + Eor + fer) 

(65) k=m+12u = 
where 7 and 7 are then explicitly determined in terms of &), &, &, &, then 

(&? + &?7)? + 
so that, since kk’ =n? —n?r, 
(67) —e= (&? + &7)(n? — 0, 


where we use r=1+€?+1. 
Conversely let «+0 satisfy (67) and define k by k=m+7n2u. Define 


(2? + &?r)k 
where fh; exists since £? +£&?70 and hence r~0. Then we have 
BiBirr’ — EFr)(E? + + 2Erter) 


and (64) will be satisfied. Moreover if we define v by (63); then (61) will be 
satisfied. Also r=1+¢ must not be the square of any quantity of F if F(x), 
u*=r, is a quadratic field over F as we are supposing. We have proved 


(66) kk’ = 


= £3 + = & + bor + 


(69) —e= + 7) = 


THEOREM 8. The solution of (61) is equivalent to the determination of v by 
(70) v= (— 7) + &7), 
and the solution of 
(71) —¢€= (n? — + 


for €, m, 12, &:, & in F and such that r =1+-€? is not the square of any quantity 
of F. 


5. The formulas for $\alpha_0$, $\gamma_0$, $a$, $\beta$. We have seen how every cyclic field $F(x)$ of degree eight over $F$ is generated by a quantity $x$ such that $x^2=a$, $x'=\beta x$ where $a$ and $\beta$ are given by (56), (58), (54), (57) as soon as $\nu$, $\epsilon$, $\tau=1+\epsilon^2$, $\beta_1=\xi_3+\xi_4u$, $\delta_1=\xi_1+\xi_2u$ have been determined to satisfy (61). We have also shown that the solution of (61) is equivalent to (70) and the solution of the equation (71) with variables in $F$. Hence we have merely to solve (71), obtaining formulas with parameters for $\epsilon$, $\eta_1$, $\eta_2$, $\xi_1$, $\xi_2$, obtain formulas for $\xi_3$ and $\xi_4$ by the use of (68), and by the substitution of values so obtained in (54), (56), (57), (58) obtain explicit $a$, $\beta$, $\alpha_0$, $\gamma_0$. But the formulas so obtained would be undesirable because of complexity. Hence we shall confine our further work to a consideration of the only remaining non-trivial part of our problem, the solution of (71). Explicit fields of degree eight may then be obtained by carrying out the above work of substitution for every special case.

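6. The case $i$ in $F$. Suppose that $F$ contains a quantity $i$ such that $i^2=-1$. Then if $\tau=1+\epsilon^2$, $\epsilon$ in $F$, we wish to solve $-\epsilon=(\xi_1^2+\xi_2^2\tau)(\eta_1^2-\eta_2^2\tau)$ for $\xi_1$, $\xi_2$, $\eta_1$, $\eta_2$ in $F$ and $\tau$ not the square of any quantity of $F$. Let

(72) $k_1 = \xi_1 + \xi_2iu$, $k_2 = \eta_1 + \eta_2u$,

so that $k_1$ is in $F(u)$, $u^2=1+\epsilon^2$, $k_2$ is in $F(u)$. Then if

(73) $k_3 = k_1k_2 = \lambda + \mu u$,

we have

(74) $\lambda^2 - \mu^2\tau = -\epsilon$,

since if $k_3'=\lambda-\mu u$ then $k_3k_3'=k_1k_1'k_2k_2'=[\xi_1^2-(\xi_2i)^2\tau][\eta_1^2-\eta_2^2\tau]=(\xi_1^2+\xi_2^2\tau)(\eta_1^2-\eta_2^2\tau)=-\epsilon$.

Conversely let $\lambda$, $\mu$ be a solution of (74). Then if $k_3$ is defined by (73) we have

$k_1 = \xi_1 + \xi_2iu = \dfrac{k_3}{k_2} = \dfrac{\lambda+\mu u}{\eta_1+\eta_2u} = \dfrac{\lambda\eta_1-\mu\eta_2\tau+(\mu\eta_1-\lambda\eta_2)u}{\eta_1^2-\eta_2^2\tau}$,

so that

(75) $\xi_1 = \dfrac{\lambda\eta_1-\mu\eta_2\tau}{\eta_1^2-\eta_2^2\tau}$, $\xi_2 = \dfrac{-(\mu\eta_1-\lambda\eta_2)i}{\eta_1^2-\eta_2^2\tau}$,

where $\eta_1$ and $\eta_2$ not both zero range independently over all quantities of $F$ so that $\eta_1^2-\eta_2^2\tau\neq 0$. We have therefore

THEOREM 9. Let $i$ be in $F$, $i^2=-1$, and $\lambda$, $\mu$, $\epsilon$ range over all solutions of

(76) $\lambda^2 - \mu^2\tau = -\epsilon$

in $F$ such that $1+\epsilon^2=\tau$ is not the square of any quantity of $F$. Then every cyclic field of degree eight over $F$ is given by (70), (68), (65), (59), (54), (56), (57), (58) for every $\eta_1$, $\eta_2$ not both zero and in $F$.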
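The passage from a solution $\lambda$, $\mu$, $\epsilon$ of (76) to quantities $\xi_1$, $\xi_2$ solving (71) can be tried on an example. The sketch below is illustrative only: it models $F$ by the Gaussian rationals (pairs of fractions), reads (75) as $k_1=k_3/k_2$, and uses hypothetical helper names; the sample values $\lambda=\mu=\epsilon=1$, $\tau=2$ satisfy (76) because $1-2=-1$.

```python
from fractions import Fraction as Fr

# Gaussian rationals x + y*i as pairs (x, y): a stand-in for a field F containing i.
def G(x, y=0):
    return (Fr(x), Fr(y))

def add(p, q): return (p[0] + q[0], p[1] + q[1])
def sub(p, q): return (p[0] - q[0], p[1] - q[1])
def mul(p, q): return (p[0] * q[0] - p[1] * q[1], p[0] * q[1] + p[1] * q[0])

def inv(p):
    d = p[0] ** 2 + p[1] ** 2
    return (p[0] / d, -p[1] / d)

def xi_pair(lam, mu, tau, eta1, eta2):
    """xi_1 and xi_2 obtained from k_1 = k_3 / k_2, as in (75)."""
    denom = inv(sub(mul(eta1, eta1), mul(mul(eta2, eta2), tau)))
    xi1 = mul(sub(mul(lam, eta1), mul(mul(mu, eta2), tau)), denom)
    xi2 = mul(mul(G(0, -1), sub(mul(mu, eta1), mul(lam, eta2))), denom)
    return xi1, xi2

def product_71(xi1, xi2, eta1, eta2, tau):
    """(xi_1^2 + xi_2^2 tau)(eta_1^2 - eta_2^2 tau), which (71) equates to -eps."""
    left = add(mul(xi1, xi1), mul(mul(xi2, xi2), tau))
    right = sub(mul(eta1, eta1), mul(mul(eta2, eta2), tau))
    return mul(left, right)

lam, mu, eps = G(1), G(1), G(1)
tau = add(G(1), mul(eps, eps))       # tau = 1 + eps^2 = 2
eta1, eta2 = G(1, 1), G(0, 1)        # any pair with eta_1^2 - eta_2^2 tau != 0
xi1, xi2 = xi_pair(lam, mu, tau, eta1, eta2)
result = product_71(xi1, xi2, eta1, eta2, tau)
```

For these values `result` is $-1=-\epsilon$, so the computed $\xi_1$, $\xi_2$ together with $\eta_1$, $\eta_2$ solve (71).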

We therefore have only to solve (76). Suppose first that $\mu=0$. Then $\epsilon=-\lambda^2$, $\tau=1+\lambda^4$, and we have proved

THEOREM 10. Let $\lambda$ range over all quantities of $F$ such that $1+\lambda^4$ is not the square of any quantity of $F$. Then (76) is satisfied by $\mu=0$, $\epsilon=-\lambda^2$, and defines corresponding cyclic fields.


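Next let $\mu\neq 0$. Define

(77) $\mu^{-1} = 2\sigma$, $\lambda\mu^{-1} = \rho$,

so that

(78) $\rho^2 - (1+\epsilon^2) = -4\sigma^2\epsilon$,

(79) $(\epsilon - 2\sigma^2)^2 - \rho^2 = 4\sigma^4 - 1$,

and

$(\epsilon - 2\sigma^2 - \rho)(\epsilon - 2\sigma^2 + \rho) = 4\sigma^4 - 1$.

Here again we must separate our work into two special cases.

Suppose first that $\epsilon-2\sigma^2-\rho=0$. Then $4\sigma^4=1$, $(2\sigma^2)^2=1$, so that $2\sigma^2=\pm 1$. Moreover if $2\sigma^2=1$ then $2\sigma=\mu^{-1}=\pm 2^{1/2}$ so that, since $\lambda=\rho\mu$, we have $\rho=\epsilon-2\sigma^2=\epsilon-1$,

(80) $\lambda = \pm(\epsilon-1)2^{-1/2}$, $\mu = \pm 2^{-1/2}$,

and $\epsilon$ ranges over all quantities of $F$ such that $1+\epsilon^2$ is not the square of any quantity of $F$. Moreover if $2\sigma^2=-1$ then $\rho=\epsilon-2\sigma^2=\epsilon+1$, $2\sigma=\mu^{-1}=\pm i2^{1/2}$, $\lambda=\rho\mu$,

(81) $\lambda = \pm(\epsilon+1)i2^{-1/2}$, $\mu = \pm i2^{-1/2}$.

We have therefore proved

THEOREM 11. Let $\epsilon$ range over all quantities of $F$ such that $1+\epsilon^2$ is not the square of any quantity of $F$. Then if $i$ is in $F$, $i^2=-1$, and $\lambda$, $\mu$ are given by either (80) or (81), so that $2^{1/2}$ is in $F$, the condition $\lambda^2-\mu^2\tau=-\epsilon$ is satisfied, and Theorem 9 defines a set of corresponding cyclic fields of degree eight over $F$.

Suppose finally that $\epsilon-2\sigma^2-\rho=\pi\neq 0$. Then $\epsilon-2\sigma^2+\rho=(4\sigma^4-1)\pi^{-1}$ and $\pi+(4\sigma^4-1)\pi^{-1}=2(\epsilon-2\sigma^2)$ while $2\rho=(4\sigma^4-1)\pi^{-1}-\pi$. Also $\lambda=\rho\mu=\rho(2\sigma)^{-1}$,

(82) $\epsilon = \dfrac{(\pi+2\sigma^2)^2-1}{2\pi}$, $\lambda = \dfrac{4\sigma^4-\pi^2-1}{4\sigma\pi}$, $\mu = (2\sigma)^{-1}$,

and we have proved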

and we have proved 




THEOREM 12. Let $F$ contain a quantity $i$ such that $i^2=-1$. Then every cyclic field of degree eight over $F$ is given by Theorem 9 with $\lambda$, $\mu$, $\epsilon$ determined by either Theorem 10 or 11 or by (82) as $\pi\neq 0$ in $F$, $\sigma\neq 0$ in $F$ range over all quantities of $F$ such that $\tau=1+\epsilon^2$ is not the square of any quantity of $F$.
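The parametric solution can be tested numerically. The sketch below is a hedged illustration: it assumes the reading $\epsilon=((\pi+2\sigma^2)^2-1)/(2\pi)$, $\lambda=(4\sigma^4-\pi^2-1)/(4\sigma\pi)$, $\mu=(2\sigma)^{-1}$ of (82), works over the rationals with exact fractions, and uses the hypothetical names `solution_82` and `satisfies_76`; the parameter called `pi` here is the paper's quantity $\pi$, not the circle constant.

```python
from fractions import Fraction as Fr

def solution_82(pi, sigma):
    """lambda, mu, eps from the parameters pi != 0 and sigma != 0 (pass Fractions)."""
    eps = ((pi + 2 * sigma ** 2) ** 2 - 1) / (2 * pi)
    lam = (4 * sigma ** 4 - pi ** 2 - 1) / (4 * sigma * pi)
    mu = 1 / (2 * sigma)
    return lam, mu, eps

def satisfies_76(lam, mu, eps):
    """Equation (76): lambda^2 - mu^2 * tau = -eps with tau = 1 + eps^2."""
    tau = 1 + eps ** 2
    return lam ** 2 - mu ** 2 * tau == -eps

# pi = sigma = 1 gives lambda = 1/2, mu = 1/2, eps = 4, tau = 17.
sample = solution_82(Fr(1), Fr(1))
checks = [
    satisfies_76(*solution_82(Fr(p), Fr(s)))
    for p in (1, 2, -3, Fr(1, 2))
    for s in (1, -2, Fr(3, 5))
]
```

Every parameter pair tried yields a solution of (76); whether the resulting $\tau$ is a non-square in $F$ must still be checked separately for each choice.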


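7. The case $\tau=-t^2$, $t$ in $F$. Let $\tau=-t^2$ where $t$ is in $F$. Then $F$ contains no quantity $i$ such that $i^2=-1$ since otherwise $\tau=(it)^2$ contrary to the fundamental assumption of our work, namely that $F(u)$, $u^2=\tau$, shall be a quadratic field over $F$. We wish to solve

(83) $-\epsilon = [\xi_1^2 - \xi_2^2t^2](\eta_1^2 - \eta_2^2\tau)$,

that is, since $\eta_1^2-\eta_2^2\tau\neq 0$,

(84) $\xi_1^2 - \xi_2^2t^2 = \dfrac{-\epsilon}{\eta_1^2-\eta_2^2\tau}$.

Since $\epsilon\neq 0$ we evidently have $\xi_1-\xi_2t=z\neq 0$. Then $\xi_1+\xi_2t=z^{-1}R$ so that

(85) $\xi_1 = \dfrac{z^2+R}{2z}$, $\xi_2 = \dfrac{R-z^2}{2tz}$, $R = \dfrac{-\epsilon}{\eta_1^2-\eta_2^2\tau}$,

and we have proved

THEOREM 13. Let $\epsilon$ and $t$ range over all quantities of $F$ such that $-1=\epsilon^2+t^2$ and $1+\epsilon^2=\tau$ is not the square of any quantity of $F$. Then $i$ is not in $F$, $i^2=-1$, and every cyclic field of degree eight over $F$ is given by (68), (65), (59), (54), (56), (57), (58), (85) when $\eta_1$ and $\eta_2$ not both zero, $z\neq 0$ range independently over all quantities of $F$.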


8. The case $\tau\neq -t^2$, $i$ not in $F$. Let $-1$ be not the square of any quantity of $F$ and let $K=F(i)$, $i^2=-1$, so that $K$ is a quadratic field over $F$. Our only remaining case is the case $-\tau\neq t^2$ for any $t$ of $F$. This is sufficient to secure the fact that $K(u)$, $u^2=\tau$, is a quadratic field over $K$,* that is, $F(i, u)$ is a quartic field over $F$.

For otherwise let $\tau=(z_1+z_2i)^2$, where $z_1$ and $z_2$ are in $F$. Then $\tau=z_1^2-z_2^2+2z_1z_2i$ so that $z_1z_2=0$. But $\tau\neq z_1^2$, $z_1$ in $F$, by hypothesis. Hence $z_2\neq 0$ and $z_1=0$. Then $\tau=(z_2i)^2=-z_2^2$ contrary to hypothesis. We have therefore proved that $\tau$ is not the square of any quantity of $K$, $K(u)$ is a quadratic field over $K$. We shall now prove


Lema. Let and be in K=F(i) so that we may write 
+ pet with re, Mi, Me in F. Let 


* This is of course not the field K of preceding sections. 


| 
| 
| 


1933] CYCLIC FIELDS OF DEGREE EIGHT 


where «#0 is in F, r=1+€?. Then 
(87) Aide = 
and there exist quantities m1, n2 in F and not both zero such that 
(88) Aime = wim, = Marne, — 
For —e=)?—y?r= +7(u? Since —e is 


in F and ¢ is not in F we have \yjA:—pyet =0 as desired. If \1+0, (88): is 
satisfied by m for every of F and 


om — = [A2— (Ar = Ar — | =O 


so that (88) is completely satisfied. If \1:=0, then =A1A2=0 so 
that u1=0 and (88); is satisfied. Then (88) is satisfied for every 70 in F 
when we take 72= (u27)~!91A2. Hence finally let \1=y2=0. Then (88) is merely 
Him. =0 which is satisfied for any in F and by 7.=0. Also «+0 
so that, by (86), \=A2z and u = are not both zero, so that necessarily 7: =0. 

Consider now the problem of determining a general solution of (71). 
Suppose we have a solution and then put 


(89) ky = &1 + (€2i)u, ke = m1 + not, kg = X+ wu = Ryko. 


Equation (89) implies kski =\?—p?r = —e=hiki kok? = (£2 + £77) (n? —n? 7) 
and (86) is satisfied where 


(90) = Eym + = Eine + 


But is in F and, by the above lemma, = Also = £1, Az = E2727, 
so that =0, Aem—AaT = £27 2m) =0, and 
(88) is satisfied. Hence every solution of (71) defines a solution of (86) in K 
for which (87) and (88) are satisfied. 

Conversely let (86) be satisfied. By the above lemma, (87), (88) are satis- 
fied. Let m, m2 range over all solutions in F of (88), not both zero, and define 
hi, ke, ks by (89) so that if 


Ar + Ae + Met 
m1 + m + nu 


(Aim: — + — Arne) Aum — 
— ne — 


1 
is in F by (88). Also 


nt — — 


963 
| 
| 
| 
ks ‘ 
then | 
| 
m + | 
4 


964 A. A. ALBERT © 


and & is in F by (88). Hence (86) determines a set of solutions of (71) and 
we have proved 


THEOREM 14. Let $F$ contain no quantity $i$ such that $i^2=-1$ and let $\epsilon\neq 0$, $\lambda$, $\mu$ range over all quantities of $K=F(i)$ such that $\lambda^2-\mu^2\tau=-\epsilon$, $\epsilon$ is in $F$, and $\tau=1+\epsilon^2$, $\pm\tau$ is not the square of any quantity of $F$. Then if we determine all quantities $\eta_1$, $\eta_2$ satisfying (88) and define $C=F(x)$ by (55)-(59), (65), (68) we obtain all cyclic fields $C$ of degree eight over $F$.

We therefore need only solve (86). This has already been accomplished in §6. Hence we have, without further proof,

THEOREM 15. Let $t$ range over all quantities of $F$ such that $\pm(1+t^4)$ is not the square of any quantity of $F$. Then if $\epsilon=-\lambda^2$, $\lambda=t$ or $ti$, $\mu=0$, we obtain a solution of $\lambda^2-\mu^2\tau=-\epsilon$ and hence a set of cyclic fields of degree eight over $F$ by the use of Theorem 14.


Next utilize the proof of Theorem 11. If $2\mu=\pm 2^{1/2}$, then either $\mu$ is in $F$ and $2^{1/2}$ is in $F$ or $2\mu=\pm ti$, $-t^2=2$, $t$ is in $F$, $(-2)^{1/2}$ is in $F$. Similarly if $2\mu=\pm(-2)^{1/2}$ then again $2\mu=\pm t$, $\pm ti$ and either $2^{1/2}$ is in $F$ or $(-2)^{1/2}$ is in $F$.

THEOREM 16. Let $\epsilon$ range over all quantities of $F$ such that $\pm(1+\epsilon^2)=\pm\tau$ is not the square of any quantity of $F$ and let either $2^{1/2}$ or $(-2)^{1/2}$ be in $F$ but $i=(-1)^{1/2}$ be not in $F$. Then if either (80) or (81) is satisfied and $\lambda$ and $\mu$ so defined in $K=F(i)$ we obtain a set of cyclic fields of degree eight over $F$ by the use of Theorem 14.

We finally use Theorem 12 to state immediately

THEOREM 17. Let $F$ contain no quantity $i$, $i^2=-1$. Then every cyclic field of degree eight over $F$ is a cyclic field of Theorems 13, 15, or 16 or is given by Theorems 9, 14, with (82) satisfied as $\pi\neq 0$, $\sigma\neq 0$ range over all quantities of $F(i)$ such that $\epsilon$ is in $F$ and $\pm(1+\epsilon^2)$ is not the square of any quantity of $F$.


UNIVERSITY OF CHICAGO, 
CHICAGO, ILL.




ADDITION TO THE NOTE†
ON SOME FUNCTIONALS‡


BY 
STANISLAW SAKS 


7. We need to recall a few known definitions.

Given an abstract space $E$ (i.e., an arbitrary set of elements), a family $\mathfrak{E}$ of sets in $E$ is said to be additive if it satisfies the following conditions:

(i) The empty set (0) belongs to $\mathfrak{E}$.

(ii) If a set $X$ belongs to $\mathfrak{E}$, its complement $CX$ (with respect to the space $E$) also belongs to $\mathfrak{E}$.

(iii) If $\{X_n\}$ is a sequence of sets belonging to $\mathfrak{E}$, the set $X=\sum X_n$ also belongs to $\mathfrak{E}$.

If $F(X)$ is a finite real-valued function of sets, defined for all sets of an additive family $\mathfrak{E}$, and if

(7.1) $F\left(\sum X_n\right) = \sum F(X_n)$

for any finite sequence $\{X_n\}$ of sets of $\mathfrak{E}$, of which no two have points in common, then $F(X)$ is called an additive function of sets of $\mathfrak{E}$. If (7.1) holds for any finite or infinite sequence $\{X_n\}$ of sets belonging to $\mathfrak{E}$, of which no two have points in common, then $F(X)$ is said to be a completely additive function of sets of $\mathfrak{E}$.

In this paragraph we assume that $\mathfrak{R}^*$ is an additive family in the space $E$, and $\mu(X)\geq 0$ is a completely additive and finite-valued function of sets of $\mathfrak{R}^*$. The sets $X$ belonging to $\mathfrak{R}^*$ are called measurable, $\mu(X)$ being the measure of $X$. A measurable set $X$ is a singular set if for any measurable subset $Y$ of $X$ either $\mu(Y)=0$ or $\mu(X-Y)=0$.

An additive function $F(X)$ of measurable sets is absolutely continuous if $F(X)=0$ whenever $X$ is of measure zero. This, together with the property of being completely additive, is equivalent to the statement that for any $\epsilon>0$ there exists an $\eta>0$ such that $\mu(X)<\eta$ implies $|F(X)|<\epsilon$.

The family $\mathfrak{R}^*$ of measurable sets may be regarded as a complete metric space with the distance defined by§

† This volume, pp. 549-556. In the present addition we extend the results of §2 to completely additive functions of sets in an abstract space. The author is indebted to Professor Tamarkin for criticisms.

‡ Presented to the Society, April 14, 1933; received by the editors February 16, 1933.

§ This definition corresponds to that of distance in the space $R$ of characteristic functions of §2.


965 




966 STANISLAW SAKS [October 
(7.2) $d(X_1, X_2) = \mu(X_1 - X_1X_2) + \mu(X_2 - X_1X_2)$.

If two measurable sets differ by subsets of measure zero they are regarded as the same element of the space $\mathfrak{R}^*$. Any completely additive and absolutely continuous function of measurable sets may be regarded as a continuous functional on the metric space $\mathfrak{R}^*$.
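The distance (7.2) is easy to experiment with on a finite model. The following sketch is an illustration only, assuming a finite space whose points carry non-negative weights so that $\mu$ is the sum of the weights; the names `measure`, `distance` and the sample weights are hypothetical.

```python
from fractions import Fraction as Fr

def measure(X, weight):
    """mu(X): total weight of the points of X."""
    return sum((weight[x] for x in X), Fr(0))

def distance(X1, X2, weight):
    """d(X1, X2) = mu(X1 - X1X2) + mu(X2 - X1X2), as in (7.2)."""
    common = X1 & X2
    return measure(X1 - common, weight) + measure(X2 - common, weight)

weight = {"p": Fr(1, 2), "q": Fr(1, 3), "r": Fr(0)}   # "r" is a point of measure zero
A = frozenset({"p", "r"})
B = frozenset({"p"})
C = frozenset({"q"})

d_AB = distance(A, B, weight)   # sets differing by a null set are at distance 0
d_AC = distance(A, C, weight)
```

Since `d_AB` is zero, the sets `A` and `B` represent the same element of the metric space, which is why sets differing by a set of measure zero are identified.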


Lemma 1. If $A$ is a measurable set of positive measure, then, for any positive number $\varepsilon$, the set $A$ contains either a singular set of measure $>\varepsilon$ or a measurable set of positive measure $\le\varepsilon$.


Suppose that $A$ contains neither a singular set of measure $>\varepsilon$, nor a measurable set of positive measure $\le\varepsilon$. Then there will exist a measurable subset $A_1$ of $A$ such that $0<\mu(A_1)<\mu(A)$. The set $A-A_1$ must be a non-singular set of measure $>\varepsilon$, and, by the same argument, $A-A_1$ contains a measurable subset $A_2$ such that $0<\mu(A_2)<\mu(A-A_1)$. By repeating this process we obtain an infinite sequence of measurable sets $\{A_n\}$ of positive measure, of which no two have points in common. Since the series

$\sum_{n=1}^{\infty}\mu(A_n)=\mu\bigl(\sum_{n=1}^{\infty}A_n\bigr)$

converges, for $n$ sufficiently large, we have $0<\mu(A_n)<\varepsilon$. This, however, contradicts the assumption that $A$ contains no measurable set of positive measure $\le\varepsilon$.


Lemma 2. Given an arbitrary number $\varepsilon>0$, the space $E$ may be expressed as the sum of a finite number of measurable sets $E_1, E_2, \dots, E_p$ such that $E_iE_j=0$ for $i\ne j$, while each $E_i$ is either a singular set or a set of measure $\le\varepsilon$.


We observe that for an arbitrary pair of singular sets, either one of them contains the other, with the possible exception of a set of measure zero, or else their common part is of measure zero. Since $\mu(E)<\infty$, on the basis of this remark we can find a finite sequence of singular sets $E_1, E_2, \dots, E_m$ of measure $>\varepsilon$ such that

(7.3) $E_iE_j = 0$ for $i \ne j$,

while the set

(7.4) $A = E - \sum_{i=1}^{m} E_i$

contains no singular set of measure $>\varepsilon$.

Let $X$ be any measurable set and let $\lambda(X)$ denote the least upper bound of the measures of all measurable subsets $Y$ of $X$ such that $\mu(Y)\le\varepsilon$. It follows from Lemma 1 that $0<\lambda(X)\le\varepsilon$ for any measurable set $X\subset A$ of positive measure. Hence, by induction, we can determine a sequence $\{X_i\}$ of measurable subsets of $A$ such that

(7.5) $X_iX_j = 0$ for $i\ne j$,

(7.6) $\varepsilon \ge \mu(X_{n+1}) \ge \tfrac12\,\lambda\bigl(A-\sum_{i=1}^{n}X_i\bigr)$.

Upon putting

$X_0 = A - \sum_{i=1}^{\infty} X_i$,

from (7.6) we have

(7.7) $\lambda(X_0)\le\lambda\bigl(A-\sum_{i=1}^{n}X_i\bigr)\le 2\mu(X_{n+1})$.

Since, by (7.5),

(7.8) $\sum_{n=1}^{\infty}\mu(X_n)\le\mu(A)<\infty$,

the series (7.8) converges and $\lim_n \mu(X_n)=0$. Thus we infer from (7.7) that $\lambda(X_0)=0$, whence also $\mu(X_0)=0$. Let now $h$ be a positive integer such that

(7.9) $\mu\bigl(\sum_{n=h+1}^{\infty}X_n\bigr)=\sum_{n=h+1}^{\infty}\mu(X_n)\le\varepsilon$,

and let

$E_{m+1}=X_1,\ \dots,\ E_{m+h}=X_h,\quad E_{m+h+1}=X_0+\sum_{n=h+1}^{\infty}X_n$.

These sets, by (7.6) and (7.9), are of measure $\le\varepsilon$, and by (7.5) no two of them have points in common. Hence the sequence $E_1, E_2, \dots, E_{m+h+1}$ satisfies the conditions of Lemma 2.
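
For a finite space with point weights the decomposition of Lemma 2 can be produced directly. The sketch below is a simplified stand-in for the inductive construction of the proof, not a transcription of it: it assumes that every point is measurable, treats each point of weight greater than `eps` as a singular set, and packs the remaining points greedily into blocks of measure at most `eps`. The function name `decompose` and the sample data are hypothetical.

```python
from fractions import Fraction

def decompose(weights, eps):
    """Split a finite weighted space into disjoint blocks.

    Returns (heavy, light): heavy blocks are single points of weight > eps
    (singular sets in this model), light blocks have total weight <= eps.
    """
    heavy = [[p] for p, w in weights.items() if w > eps]
    light, current, total = [], [], Fraction(0)
    for p, w in weights.items():
        if w > eps:
            continue  # already placed among the heavy blocks
        if current and total + w > eps:
            light.append(current)  # close the block before it exceeds eps
            current, total = [], Fraction(0)
        current.append(p)
        total += w
    if current:
        light.append(current)
    return heavy, light

EXAMPLE_WEIGHTS = {
    "a": Fraction(1, 2),
    "b": Fraction(1, 8),
    "c": Fraction(1, 8),
    "d": Fraction(1, 8),
    "e": Fraction(1, 8),
}
EXAMPLE_EPS = Fraction(1, 4)
HEAVY, LIGHT = decompose(EXAMPLE_WEIGHTS, EXAMPLE_EPS)
```

With the sample data the point `a` forms the only singular block and the other four points fall into two blocks of measure exactly 1/4. The greedy packing is a convenience of the finite case; the proof above works with the quantity λ(X) instead, because a general measurable set need not split into points.
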
8. We now are able to generalize Theorems 1 and 2 of §2. 


Theorem 5. Let $\{F_n(X)\}$ be a sequence of completely additive and absolutely continuous functions of measurable sets. If this sequence converges for any set belonging to a class of the second category in the space $\mathfrak{R}^*$, then the functions $F_n(X)$ are equally absolutely continuous† and the sequence $\{F_n(X)\}$ converges for any measurable set $X\subset E-(E_1+E_2+\dots+E_m)$ where $\{E_i\}$ is a finite sequence of singular sets.

Consequently, if $\{F_n(X)\}$ converges for any measurable set $X$, the limit function is again a completely additive and absolutely continuous function of measurable sets in $E$.


† That is, to every $\varepsilon>0$ there corresponds an $\eta>0$ which depends only on $\varepsilon$, such that $|F_n(X)|\le\varepsilon$ for $n=1,2,\dots$ and for any set $X$ of measure $\le\eta$.




The fact that the functions $F_n(X)$ are equally absolutely continuous can be established in exactly the same fashion as in Theorem 1, §2, if we interpret the functions $F_n(X)$ as continuous functionals in the complete metric space $\mathfrak{R}^*$. Now, since by assumption the sequence $\{F_n(X)\}$ converges for any $X$ belonging to a set of the second category in $\mathfrak{R}^*$, there exists in $\mathfrak{R}^*$ a sphere, say $\mathfrak{K}(A_0; r)$, such that $\{F_n(X)\}$ converges for each $X$ of a set everywhere dense in $\mathfrak{K}(A_0; r)$. But the functionals $F_n(X)$ are equally continuous in $\mathfrak{R}^*$, hence the sequence $\{F_n(X)\}$ converges everywhere in the sphere $\mathfrak{K}(A_0; r)$.

Now let

$E=\sum_{i=1}^{p}E_i$

be a representation of the space $E$ mentioned in Lemma 2. We may assume that the sets $E_1,\dots,E_m$ are singular while the sets $E_{m+1},\dots,E_p$ are of measure $\le r$.

Let $X$ be an arbitrary measurable set contained in $\sum_{i=m+1}^{p}E_i$. Then

(8.1) $X=\sum_{i=m+1}^{p}XE_i$.

Each set $XE_i$, $i=m+1,\dots,p$, is of measure $\le r$. Consequently the sets $A_0+XE_i$ and $A_0-A_0XE_i$, $i=m+1,\dots,p$, are elements of the sphere $\mathfrak{K}(A_0;r)$ and both sequences $\{F_n(A_0+XE_i)\}$, $\{F_n(A_0-A_0XE_i)\}$ converge. Thus the sequence

$F_n(XE_i)=F_n(A_0+XE_i)-F_n(A_0-A_0XE_i)$

also converges for $i=m+1,\dots,p$. Hence, by (8.1), the sequence $\{F_n(X)\}$ converges for any measurable set $X$ contained in $E-(E_1+\dots+E_m)$ where $E_1,\dots,E_m$ are singular sets.
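
The step that carries the proof is the identity F(Y) = F(A0 + Y) - F(A0 - A0Y), which holds for any additive F because A0 + Y is the disjoint sum of A0 - A0Y and Y. A minimal sketch checking it exhaustively, assuming a hypothetical additive function given by signed point masses (`MASS`, `F`, `recover` and `all_subsets` are illustrative names):

```python
from fractions import Fraction
from itertools import combinations

# Hypothetical signed point masses; F(X) is the sum of the masses in X.
MASS = {"a": Fraction(3), "b": Fraction(-2), "c": Fraction(5), "d": Fraction(-1)}

def F(x):
    """An additive set function on subsets of the four-point space."""
    return sum((MASS[p] for p in x), Fraction(0))

def recover(a0, y):
    """F(Y) computed as F(A0 + Y) - F(A0 - A0Y)."""
    return F(a0 | y) - F(a0 - (a0 & y))

def all_subsets(points):
    """Every subset of the given points, as frozensets."""
    pts = sorted(points)
    return [frozenset(c) for k in range(len(pts) + 1) for c in combinations(pts, k)]
```

Both sets on the right-hand side lie within distance mu(Y) of A0 in the metric (7.2), so values of F on small sets can be read off from values of F inside a sphere about A0.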

Theorem 6. If $\{F_n(X)\}$ is a sequence of completely additive and absolutely continuous functions of measurable sets and if

(8.2) $\limsup_{n\to\infty}|F_n(X)|<\infty$

for any set $X$ belonging to a class of the second category in the space $\mathfrak{R}^*$, then there exists a fixed constant $M$ such that

(8.3) $|F_n(X)|\le M$

for any measurable set $X\subset E-(E_1+\dots+E_m)$ where $\{E_i\}$ is a finite sequence of singular sets in $E$.

Consequently, if the inequality (8.2) holds for every measurable set $X$, there exists a constant $M$ such that (8.3) holds for all measurable sets $X$ in $E$.




Let $\mathfrak{R}_k^*$ be the aggregate of sets $X$ such that

$|F_n(X)|\le k$ $(n=1,2,\dots)$.

By assumption the class $\sum_{k=1}^{\infty}\mathfrak{R}_k^*$ is of the second category in the space $\mathfrak{R}^*$. By the continuity of the functionals $F_n(X)$, the sets $\mathfrak{R}_k^*$ are closed (in the space $\mathfrak{R}^*$). Hence, for some value $k=k_0$, $\mathfrak{R}_{k_0}^*$ contains a sphere, say $\mathfrak{K}(A_0;r)$.

We now introduce the same representation of the space

$E=\sum_{i=1}^{p}E_i$

as in the proof of Theorem 5. Let $X$ be an arbitrary measurable set. Since, for $i=m+1,\dots,p$, the sets $XE_i$ are of measure $\le r$, the sets $A_0+XE_i$ and $A_0-A_0XE_i$ belong to the sphere $\mathfrak{K}(A_0;r)$. Thus

$|F_n(A_0+XE_i)|\le k_0,\quad |F_n(A_0-A_0XE_i)|\le k_0$,

and

$|F_n(XE_i)|=|F_n(A_0+XE_i)-F_n(A_0-A_0XE_i)|\le 2k_0$.

Hence, for any measurable set $X\subset E-(E_1+\dots+E_m)$ we have

$|F_n(X)|=\bigl|\sum_{i=m+1}^{p}F_n(XE_i)\bigr|\le 2(p-m)k_0$,

which completes the proof of Theorem 6.

9. Theorems 5 and 6 contain the corresponding two theorems which have been stated recently by Nikodym.†

I. If $\mathfrak{E}$ is an additive family of sets in an abstract space $E$, and if the sequence $\{F_n(X)\}$ of completely additive functions of sets of $\mathfrak{E}$ converges for every set $X$ of $\mathfrak{E}$, then the limit function is also a completely additive function of sets of $\mathfrak{E}$.

II. If $\mathfrak{E}$ is an additive family of sets in $E$ and if the sequence $\{F_n(X)\}$ of completely additive functions of sets of $\mathfrak{E}$ is bounded for every set $X$ of $\mathfrak{E}$, then there exists a constant $M$ such that $|F_n(X)|<M$ for $n=1,2,\dots$ and for all $X\in\mathfrak{E}$.

In order to reduce these theorems to Theorems 5 and 6 respectively we merely have to introduce a measure $\mu(X)$ for the family $\mathfrak{E}$, with respect to which the functions $F_n(X)$ would be absolutely continuous. This can be achieved by putting, for each set $X\in\mathfrak{E}$,

† O. Nikodym, Sur les suites des fonctions parfaitement additives d’ensembles abstraits, Comptes Rendus, vol. 192 (1931), pp. 727-728. The proofs of the results stated in that note will appear in the Monatshefte für Mathematik und Physik.




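(9.1) [display garbled in the source: $\mu(X)$ is defined as a convergent series formed from the $V_n(X)$]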

where $V_n(X)$ denotes the absolute variation† of $F_n(X)$ on the set $X$. Since each $V_n(X)$ is a non-negative and completely additive function of sets of $\mathfrak{E}$ the series (9.1) converges and $\mu(X)\ge 0$ is a completely additive and finite-valued function of sets of $\mathfrak{E}$. Hence $\mu(X)$ may be taken as a measure in $E$ and, since $F_n(X)=0$, $n=1,2,\dots$, for every set $X$ of $\mathfrak{E}$ such that $\mu(X)=0$, the functions $F_n(X)$ are absolutely continuous with respect to this measure. Thus the theorems of Nikodym are reduced to our Theorems 5 and 6.
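
The reduction can be illustrated numerically. The sketch below is built on assumptions throughout. It uses a finite family of three hypothetical functions given by signed point masses, and it normalizes the variations as V_n(X) / (2^n (1 + V_n(E))). That normalization is one common choice made here for the example; it is not claimed to be the formula (9.1) of the paper.

```python
from fractions import Fraction
from itertools import combinations

POINTS = "abcd"
# Hypothetical signed point masses for three functions F_1, F_2, F_3.
MASSES = [
    {"a": Fraction(2), "b": Fraction(-2), "c": Fraction(0), "d": Fraction(0)},
    {"a": Fraction(0), "b": Fraction(7), "c": Fraction(0), "d": Fraction(0)},
    {"a": Fraction(-1), "b": Fraction(1), "c": Fraction(0), "d": Fraction(3)},
]

def F(n, x):
    """Value of the n-th function (0-based index) on the set x."""
    return sum((MASSES[n][p] for p in x), Fraction(0))

def V(n, x):
    """Absolute variation of the n-th function on x (sum of |mass|)."""
    return sum((abs(MASSES[n][p]) for p in x), Fraction(0))

def mu(x):
    """Assumed control measure: weighted sum of normalized variations."""
    return sum(
        Fraction(1, 2 ** (n + 1)) * V(n, x) / (1 + V(n, POINTS))
        for n in range(len(MASSES))
    )

SUBSETS = [c for k in range(len(POINTS) + 1) for c in combinations(POINTS, k)]
```

The variation is used rather than |F_n(X)| itself: in the sample data the first function vanishes on the set {a, b}, yet that set has positive variation and therefore positive measure, so additivity of the measure is preserved.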


† See for instance H. Hahn, Theorie der reellen Funktionen, 1921, Chapter VI. The absolute variation is called there (p. 400) “absolute Summe.”


UNIVERSITY OF WARSAW, 
Warsaw, POLAND 




A SECOND CORRECTION 


BY 
EDWARD V. HUNTINGTON 


In my paper in the present volume of these Transactions (vol. 35, pp. 274-304 and 557-558), Example 5 on page 304 is erroneous and Postulate 5 on page 301 is redundant. That is, Postulates 1, 2, 3, 4, 6, 7 (without 5) form a set of independent postulates for the “informal” system of Principia Mathematica.

The proof of 5 from 1, 2, 3, 4, 6, 7 is as follows.* 

6a. If a+b is in T and a not in T, then b is in T. (From 6.) 

6b. If a not in T and b not in T, then a+b not in T. (From 6.) 

7a. If a is in T, then a' is not in T. (From 7.)

3a. If b is in T, then a+b is in T.

For, by 7a, b' is not in T. But by 3, b'+(a+b) is in T. Hence by 6a, a+b is in T.

4a. If b is in T, then b+a is in T.

For, by 3a, a+b is in T, whence by 7a, (a+b)' is not in T. But by 4, (a+b)'+(b+a) is in T. Hence by 6a, b+a is in T.

5a. If a is not in T, then a' is in T.

For, suppose a' not in T. Then by 6b, a'+a not in T, whence by 6b, a'+(a'+a) not in T, contrary to 3.

5. If a, b, etc. are in K, then (b'+c)'+[(a+b)'+(a+c)] is in T.

Case 1: a in T. By 4a, a+c is in T. Hence the theorem, by 3a (twice).

Case 2: b in T. By 7a, b' is not in T. If c is in T, then by 3a, a+c is in T, whence the theorem, by 3a (twice). If c is not in T, then by 6b, b'+c is not in T, whence by 5a, (b'+c)' is in T, whence the theorem, by 4a.

Case 3: a not in T and b not in T. By 6b, a+b not in T, whence by 5a, (a+b)' is in T. Hence the theorem, by 4a and 3a.

The proof is thus complete. It can also be shown that 1, 2, 3a, 4a, 5a, 6a, 7 
form a set of independent postulates equivalent to the set 1, 2, 3, 4, 6, 7. 
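
The case analysis can be mirrored by a truth-table check. By 3a, 4a and 6b, a+b is in T exactly when a or b is; by 5a and 7a, a' is in T exactly when a is not. The sketch below assumes this two-valued reading, with `True` standing for "is in T"; the function names are chosen for the illustration and do not appear in the paper.

```python
from itertools import product

def neg(a):
    """a' is in T exactly when a is not in T (rules 7a and 5a)."""
    return not a

def plus(a, b):
    """a+b is in T exactly when a or b is in T (rules 3a, 4a, 6b)."""
    return a or b

def postulate_5(a, b, c):
    """Membership in T of (b'+c)' + [(a+b)' + (a+c)]."""
    return plus(neg(plus(neg(b), c)), plus(neg(plus(a, b)), plus(a, c)))

# Run through all eight assignments of "in T" / "not in T" to a, b, c.
ALWAYS_IN_T = all(
    postulate_5(a, b, c) for a, b, c in product([False, True], repeat=3)
)
```

The check confirms the conclusion under the stated reading. It is not a substitute for the derivation above, which shows that this reading already follows from Postulates 3, 4, 6 and 7.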


* For valuable suggestions in this connection I am indebted to Professor Alonzo Church and 
Dr. K. E. Rosinger. 


HARVARD UNIVERSITY, 
CAMBRIDGE, Mass. 
May 29, 1933. 


CORRECTION TO A PAPER ON THE MOORE-KLINE 
PROBLEM* 


BY 
LEO ZIPPIN 


It has been brought to my attention by Mr. N. E. Steenrod that the 
lemma of page 708 in the paper referred to is in error. The final assertion 
of the proof is false. It is therefore necessary to point out that the paper 
is not “disturbed” by this fault. For if one requires that the $P_n$, $n=1,2,\dots$,
of the lemma be arcs then the (altered) lemma does hold, since it is true 
that the connected sum of a perfect continuous curve and an arc is “perfect.” 
One verifies that this restricted lemma is sufficient for the uses of the paper. 


* Published under that title, these Transactions, vol. 34, pp. 705-721. 


PRINCETON UNIVERSITY,
Princeton, N. J. 




